Sophie Hao

Generative Linguistics, Large Language Models, and the Social Nature of Scientific Success

Mar 25, 2025

Sophie Hao

Abstract: Chesi's (forthcoming) target paper depicts a generative linguistics in crisis, foreboded by Piantadosi's (2023) declaration that "modern language models refute Chomsky's approach to language." In order to survive, Chesi warns, generativists must hold themselves to higher standards of formal and empirical rigor. This response argues that the crisis described by Chesi and Piantadosi actually has little to do with rigor, but is rather a reflection of generativists' limited social ambitions. Chesi ties the fate of generative linguistics to its intellectual merits, but the current success of language model research is social in nature as much as it is intellectual. In order to thrive, then, generativists must do more than heed Chesi's call for rigor; they must also expand their ambitions by giving outsiders a stake in their future success.

* To appear in the Italian Journal of Linguistics. This is a response to Chesi (2024): arXiv:2412.12797

What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length

Nov 04, 2024

Lindia Tjuatja, Graham Neubig, Tal Linzen, Sophie Hao

Abstract: When comparing the linguistic capabilities of language models (LMs) with humans using LM probabilities, factors such as the length of the sequence and the unigram frequency of lexical items have a significant effect on LM probabilities in ways that humans are largely robust to. Prior work comparing LM and human acceptability judgments treats these effects uniformly across models, making a strong assumption that models require the same degree of adjustment to control for length and unigram frequency effects. We propose MORCELA, a new linking theory between LM scores and acceptability judgments where the optimal level of adjustment for these effects is estimated from data via learned parameters for length and unigram frequency. We first show that MORCELA outperforms a commonly used linking theory for acceptability, SLOR (Pauls and Klein, 2012; Lau et al., 2017), across two families of transformer LMs (Pythia and OPT). Furthermore, we demonstrate that the assumed degrees of adjustment in SLOR for length and unigram frequency overcorrect for these confounds, and that larger models require a lower relative degree of adjustment for unigram frequency, though a significant amount of adjustment is still necessary for all models. Finally, our subsequent analysis shows that larger LMs' lower susceptibility to frequency effects can be explained by an ability to better predict rarer words in context.
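
The SLOR linking theory named in this abstract subtracts a sentence's unigram log-probability from its LM log-probability and divides by sentence length. The sketch below shows that computation in Python, assuming per-token log-probabilities have already been obtained elsewhere. The second function, `adjusted_score`, and its parameters `beta` and `gamma` are hypothetical: they only illustrate what "learned parameters for length and unigram frequency" could look like, and should not be read as the paper's definition of MORCELA.

```python
from typing import Sequence


def slor(lm_logprobs: Sequence[float], unigram_logprobs: Sequence[float]) -> float:
    """SLOR: (log p_LM(s) - log p_unigram(s)) / |s|, with |s| the token count."""
    if not lm_logprobs or len(lm_logprobs) != len(unigram_logprobs):
        raise ValueError("need two non-empty sequences of equal length")
    return (sum(lm_logprobs) - sum(unigram_logprobs)) / len(lm_logprobs)


def adjusted_score(
    lm_logprobs: Sequence[float],
    unigram_logprobs: Sequence[float],
    beta: float = 1.0,   # hypothetical weight on the unigram correction
    gamma: float = 0.0,  # hypothetical additive term interacting with length
) -> float:
    """Illustrative parameterized variant (an assumption, not the paper's formula).

    With beta=1.0 and gamma=0.0 this reduces exactly to SLOR.
    """
    if not lm_logprobs or len(lm_logprobs) != len(unigram_logprobs):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(lm_logprobs)
    return (sum(lm_logprobs) - beta * sum(unigram_logprobs) + gamma) / n
```

In a setup like this, `beta` and `gamma` would be fit per model against human acceptability ratings, which is one way to test whether SLOR's fixed adjustment is too strong for a given model.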

ERAS: Evaluating the Robustness of Chinese NLP Models to Morphological Garden Path Errors

Oct 16, 2024

Qinchan Li, Sophie Hao

Abstract: In languages without orthographic word boundaries, NLP models perform word segmentation, either as an explicit preprocessing step or as an implicit step in an end-to-end computation. This paper shows that Chinese NLP models are vulnerable to morphological garden path errors: errors caused by a failure to resolve local word segmentation ambiguities using sentence-level morphosyntactic context. We propose a benchmark, ERAS, that tests a model's vulnerability to morphological garden path errors by comparing its behavior on sentences with and without local segmentation ambiguities. Using ERAS, we show that word segmentation models make garden path errors on locally ambiguous sentences, but do not make equivalent errors on unambiguous sentences. We further show that sentiment analysis models with character-level tokenization make implicit garden path errors, even without an explicit word segmentation step in the pipeline. Our results indicate that models' segmentation of Chinese text often fails to account for morphosyntactic context.

* Under review in ARR/NAACL

Reflecting the Male Gaze: Quantifying Female Objectification in 19th and 20th Century Novels

Mar 25, 2024

Kexin Luo, Yue Mao, Bei Zhang, Sophie Hao

Abstract: Inspired by the concept of the male gaze (Mulvey, 1975) in literature and media studies, this paper proposes a framework for analyzing gender bias in terms of female objectification: the extent to which a text portrays female individuals as objects of visual pleasure. Our framework measures female objectification along two axes. First, we compute an agency bias score that indicates whether male entities are more likely to appear in the text as grammatical agents than female entities. Next, by analyzing the word embedding space induced by a text (Caliskan et al., 2017), we compute an appearance bias score that indicates whether female entities are more closely associated with appearance-related words than male entities. Applying our framework to 19th and 20th century novels reveals evidence of female objectification in literature: we find that novels written from a male perspective systematically objectify female characters, while novels written from a female perspective do not exhibit statistically significant objectification of any gender.

* To appear in LREC-COLING 2024
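
The appearance bias score described in this abstract compares how closely appearance-related words sit to female versus male entities in an embedding space. The Python sketch below shows one simple way such a comparison could be computed, using a difference of mean cosine similarities in the spirit of the association tests of Caliskan et al. (2017). The function names and the exact scoring formula are assumptions made for illustration; the abstract does not specify the paper's formula.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word: Vector, group: List[Vector]) -> float:
    """Average cosine similarity between one word and every vector in a group."""
    return sum(cosine(word, g) for g in group) / len(group)


def appearance_association(
    appearance_vecs: List[Vector],
    female_vecs: List[Vector],
    male_vecs: List[Vector],
) -> float:
    """Hypothetical score: positive when appearance words lie closer to the
    female vectors than to the male vectors, negative in the opposite case."""
    diffs = [
        mean_similarity(w, female_vecs) - mean_similarity(w, male_vecs)
        for w in appearance_vecs
    ]
    return sum(diffs) / len(diffs)
```

A score near zero would indicate no measurable difference in association under this toy formulation; judging significance would require a separate statistical test.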

Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number

Oct 23, 2023

Sophie Hao, Tal Linzen

Abstract: Deep architectures such as Transformers are sometimes criticized for having uninterpretable "black-box" representations. We use causal intervention analysis to show that, in fact, some linguistic features are represented in a linear, interpretable format. Specifically, we show that BERT's ability to conjugate verbs relies on a linear encoding of subject number that can be manipulated with predictable effects on conjugation accuracy. This encoding is found in the subject position at the first layer and the verb position at the last layer, but distributed across positions at middle layers, particularly when there are multiple cues to subject number.

* To appear in Findings of the Association for Computational Linguistics: EMNLP 2023
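
To make the idea of manipulating a linear encoding concrete, the Python sketch below reflects a hidden vector across the hyperplane orthogonal to a single direction, which flips the sign of the vector's component along that direction and leaves everything else unchanged. This is only one simple form a linear intervention could take. The direction `number_direction` is hypothetical, and the abstract does not say how the authors locate or manipulate the encoding, so this should not be read as their procedure.

```python
import math
from typing import List, Sequence


def unit(direction: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in direction]


def component(hidden: Sequence[float], direction: Sequence[float]) -> float:
    """Signed length of `hidden` along `direction`."""
    d = unit(direction)
    return sum(h * x for h, x in zip(hidden, d))


def reflect(hidden: Sequence[float], number_direction: Sequence[float]) -> List[float]:
    """Flip the component of `hidden` along a (hypothetical) feature direction:
    h' = h - 2 (h . d) d, where d is the unit-normalized direction."""
    d = unit(number_direction)
    proj = sum(h * x for h, x in zip(hidden, d))
    return [h - 2.0 * proj * x for h, x in zip(hidden, d)]
```

If a feature such as subject number really were encoded along one direction, an edit of this kind applied to a hidden state would be expected to push the model toward the opposite verb form, which is the sort of predictable effect the abstract describes.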
