Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Learning the meanings of function words from grounded language using a visual question answering model

Aug 16, 2023

Eva Portelance, Michael C. Frank, Dan Jurafsky

Figure 1 for Learning the meanings of function words from grounded language using a visual question answering model

Figure 2 for Learning the meanings of function words from grounded language using a visual question answering model

Figure 3 for Learning the meanings of function words from grounded language using a visual question answering model

Figure 4 for Learning the meanings of function words from grounded language using a visual question answering model

Share this with someone who'll enjoy it:

Abstract:Interpreting a seemingly-simple function word like "or", "behind", or "more" can require logical, numerical, and relational reasoning. How are such words learned by children? Prior acquisition theories have often relied on positing a foundation of innate knowledge. Yet recent neural-network based visual question answering models apparently can learn to use function words as part of answering questions about complex visual scenes. In this paper, we study what these models learn about function words, in the hope of better understanding how the meanings of these words can be learnt by both models and children. We show that recurrent models trained on visually grounded language learn gradient semantics for function words requiring spacial and numerical reasoning. Furthermore, we find that these models can learn the meanings of logical connectives "and" and "or" without any prior knowledge of logical reasoning, as well as early evidence that they can develop the ability to reason about alternative expressions when interpreting language. Finally, we show that word learning difficulty is dependent on frequency in models' input. Our findings offer evidence that it is possible to learn the meanings of function words in visually grounded context by using non-symbolic general statistical learning algorithms, without any prior knowledge of linguistic meaning.

View paper on

Share this with someone who'll enjoy it:

Title:Learning the meanings of function words from grounded language using a visual question answering model

Paper and Code