Title: Can RNNs trained on harder subject-verb agreement instances still perform well on easier ones?

Oct 10, 2020

Hritik Bansal, Gantavya Bhatt, Sumeet Agarwal

Abstract: In English, the main subject and its associated verb must agree in grammatical number; this is the Subject-Verb Agreement (SVA) phenomenon. When a noun whose grammatical number is opposite to that of the main subject intervenes between the main subject and the verb, speakers can produce a verb that agrees with the intervening noun rather than the main noun; the intervening noun thus acts as an agreement attractor. Such attractors have also been shown to pose a challenge on SVA tasks for RNN models without explicit hierarchical bias. Previous work suggests that syntactic cues in the input can help such models choose hierarchical rules over linear rules for number agreement. In this work, we investigate the effects of the choice of training data, training algorithm, and architecture on hierarchical generalization. We observe that the models under consideration fail to perform well on sentences with no agreement attractor when trained solely on natural sentences with at least one attractor. Even in the presence of this biased training set, implicit hierarchical bias in the architecture (as in the Ordered Neurons LSTM) is not enough to capture syntax-sensitive dependencies. These results suggest that current RNNs do not capture the underlying hierarchical rules of natural language, but rather use shallower heuristics for their predictions.

* 15 pages, 3 figures, 13 tables (including Appendix); submitted for review
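
To make the notion of an agreement attractor in the abstract concrete, here is a minimal Python sketch that counts attractors between a subject and its verb. It is illustrative only and not the authors' code: the `(word, number)` token format, the `count_attractors` helper, and the hand-annotated example sentences are all assumptions introduced here.

```python
from typing import List, Optional, Tuple

# Hypothetical token format (an assumption for illustration):
# (word, number), where number is "sg" or "pl" for nouns and None otherwise.
Token = Tuple[str, Optional[str]]


def count_attractors(tokens: List[Token], subj_idx: int, verb_idx: int) -> int:
    """Count nouns strictly between subject and verb whose grammatical
    number differs from the subject's (i.e., agreement attractors)."""
    subj_number = tokens[subj_idx][1]
    if subj_number is None:
        raise ValueError("subject token must carry a grammatical number")
    return sum(
        1
        for _, number in tokens[subj_idx + 1 : verb_idx]
        if number is not None and number != subj_number
    )


# "The keys to the cabinet are ...": "cabinet" (singular) intervenes
# between the plural subject "keys" and the verb "are".
with_attractor: List[Token] = [
    ("the", None), ("keys", "pl"), ("to", None),
    ("the", None), ("cabinet", "sg"), ("are", None),
]

# "The keys are ...": no intervening noun, hence no attractor.
without_attractor: List[Token] = [("the", None), ("keys", "pl"), ("are", None)]

n_hard = count_attractors(with_attractor, subj_idx=1, verb_idx=5)
n_easy = count_attractors(without_attractor, subj_idx=1, verb_idx=2)
```

Under these assumptions, a training set of "harder" instances in the abstract's sense would keep only sentences with a count of at least one, while the "easier" evaluation sentences would have a count of zero.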
