Xixian Liao

MT-LENS: An all-in-one Toolkit for Better Machine Translation Evaluation

Dec 16, 2024

Javier García Gilabert, Carlos Escolano, Audrey Mash, Xixian Liao, Maite Melero

Abstract: We introduce MT-LENS, a framework designed to evaluate Machine Translation (MT) systems across a variety of tasks, including translation quality, gender bias detection, added toxicity, and robustness to misspellings. While several toolkits have become very popular for benchmarking the capabilities of Large Language Models (LLMs), existing evaluation tools often lack the ability to thoroughly assess the diverse aspects of MT performance. MT-LENS addresses these limitations by extending the capabilities of LM-eval-harness for MT, supporting state-of-the-art datasets and a wide range of evaluation metrics. It also offers a user-friendly platform to compare systems and analyze translations with interactive visualizations. MT-LENS aims to broaden access to evaluation strategies that go beyond traditional translation quality evaluation, enabling researchers and engineers to better understand the performance of an NMT model and to easily measure a system's biases.

* 6 pages, 2 figures

Investigating the translation capabilities of Large Language Models trained on parallel data only

Jun 13, 2024

Javier García Gilabert, Carlos Escolano, Aleix Sant Savall, Francesca De Luca Fornaciari, Audrey Mash, Xixian Liao, Maite Melero

Abstract: In recent years, Large Language Models (LLMs) have demonstrated exceptional proficiency across a broad spectrum of Natural Language Processing (NLP) tasks, including Machine Translation. However, previous methods predominantly relied on iterative processes such as instruction fine-tuning or continual pre-training, leaving unexplored the challenges of training LLMs solely on parallel data. In this work, we introduce PLUME (Parallel Language Model), a collection of three 2B LLMs featuring varying vocabulary sizes (32k, 128k, and 256k) trained exclusively on Catalan-centric parallel examples. These models perform comparably to previous encoder-decoder architectures on 16 supervised translation directions and 56 zero-shot ones. Utilizing this set of models, we conduct a thorough investigation into the translation capabilities of LLMs, probing their performance, the impact of the different elements of the prompt, and their cross-lingual representation space.

* We release our code at: https://github.com/projecte-aina/Plume

The Impact of Familiarity on Naming Variation: A Study on Object Naming in Mandarin Chinese

Nov 16, 2023

Yunke He, Xixian Liao, Jialing Liang, Gemma Boleda

Abstract: Different speakers often produce different names for the same object or entity (e.g., "woman" vs. "tourist" for a female tourist). The reasons behind variation in naming are not well understood. We create a Language and Vision dataset for Mandarin Chinese that provides an average of 20 names for 1319 naturalistic images, and investigate how familiarity with a given kind of object relates to the degree of naming variation it triggers across subjects. We propose that familiarity influences naming variation in two competing ways: increasing familiarity can either expand vocabulary, leading to higher variation, or promote convergence on conventional names, thereby reducing variation. We find evidence for both factors being at play. Our study illustrates how computational resources can be used to address research questions in Cognitive Science.

Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution

Sep 27, 2021

Laura Aina, Xixian Liao, Gemma Boleda, Matthijs Westera

Abstract: It is often posited that more predictable parts of a speaker's meaning tend to be made less explicit, for instance using shorter, less informative words. Studying these dynamics in the domain of referring expressions has proven difficult, with existing studies, both psycholinguistic and corpus-based, providing contradictory results. We test the hypothesis that speakers produce less informative referring expressions (e.g., pronouns vs. full noun phrases) when the context is more informative about the referent, using novel computational estimates of referent predictability. We obtain these estimates by training an existing coreference resolution system for English on a new task, masked coreference resolution, giving us a probability distribution over referents that is conditioned on the context but not the referring expression. The resulting system retains standard coreference resolution performance while yielding a better estimate of human-derived referent predictability than previous attempts. A statistical analysis of the relationship between model output and mention form supports the hypothesis that predictability affects the form of a mention, both its morphosyntactic type and its length.
