Abstract:This study explores how recent large language models (LLMs) navigate relative clause attachment {ambiguity} and use world knowledge biases for disambiguation in six typologically diverse languages: English, Chinese, Japanese, Korean, Russian, and Spanish. We describe the process of creating a novel dataset -- MultiWho -- for fine-grained evaluation of relative clause attachment preferences in ambiguous and unambiguous contexts. Our experiments with three LLMs indicate that, contrary to humans, LLMs consistently exhibit a preference for local attachment, displaying limited responsiveness to syntactic variations or language-specific attachment patterns. Although LLMs performed well in unambiguous cases, they rigidly prioritized world knowledge biases, lacking the flexibility of human language processing. These findings highlight the need for more diverse, pragmatically nuanced multilingual training to improve LLMs' handling of complex structures and human-like comprehension.
Abstract:This study examines how large language models (LLMs) resolve relative clause (RC) attachment ambiguities and compares their performance to human sentence processing. Focusing on two linguistic factors, namely the length of RCs and the syntactic position of complex determiner phrases (DPs), we assess whether LLMs can achieve human-like interpretations amid the complexities of language. In this study, we evaluated several LLMs, including Claude, Gemini and Llama, in multiple languages: English, Spanish, French, German, Japanese, and Korean. While these models performed well in Indo-European languages (English, Spanish, French, and German), they encountered difficulties in Asian languages (Japanese and Korean), often defaulting to incorrect English translations. The findings underscore the variability in LLMs' handling of linguistic ambiguities and highlight the need for model improvements, particularly for non-European languages. This research informs future enhancements in LLM design to improve accuracy and human-like processing in diverse linguistic environments.