Abstract: In recent years, Large Language Models (LLMs) have demonstrated exceptional proficiency across a broad spectrum of Natural Language Processing (NLP) tasks, including Machine Translation. However, previous methods predominantly relied on iterative processes such as instruction fine-tuning or continual pre-training, leaving unexplored the challenge of training LLMs solely on parallel data. In this work, we introduce PLUME (Parallel Language Model), a collection of three 2B LLMs featuring varying vocabulary sizes (32k, 128k, and 256k) trained exclusively on Catalan-centric parallel examples. These models perform comparably to previous encoder-decoder architectures on 16 supervised translation directions and 56 zero-shot ones. Utilizing this set of models, we conduct a thorough investigation into the translation capabilities of LLMs, probing their performance, the impact of the different elements of the prompt, and their cross-lingual representation space.
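As an illustration of what training solely on parallel data can look like, the sketch below serializes one Catalan–English sentence pair into a single sequence with language tags, suitable for next-token prediction with a decoder-only model. The tag names and layout are illustrative assumptions, not necessarily PLUME's exact prompt format.

```python
# Illustrative sketch only: tag names and layout are assumptions, not
# necessarily PLUME's exact prompt format.
def to_training_example(src_lang: str, tgt_lang: str, src: str, tgt: str) -> str:
    """Serialize one parallel sentence pair into a single training sequence
    for next-token prediction with a decoder-only model."""
    return f"<{src_lang}> {src} <{tgt_lang}> {tgt} </s>"

example = to_training_example(
    "cat", "eng", "El gat dorm al sofà.", "The cat sleeps on the sofa."
)
print(example)
# <cat> El gat dorm al sofà. <eng> The cat sleeps on the sofa. </s>
```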
Abstract: Our proposed method, ReSeTOX (REdo SEarch if TOXic), addresses the issue of Neural Machine Translation (NMT) generating translation outputs that contain toxic words not present in the input. The objective is to mitigate the introduction of toxic language without the need for re-training. In the case of identified added toxicity during the inference process, ReSeTOX dynamically adjusts the key-value self-attention weights and re-evaluates the beam search hypotheses. Experimental results demonstrate that ReSeTOX achieves a remarkable 57% reduction in added toxicity while maintaining an average translation quality of 99.5% across 164 languages.
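The toy-scale sketch below illustrates the mechanism described above: if a decoding step assigns too much probability to an assumed set of toxic tokens, the cached key/value activations are nudged by a gradient step and the step is redone. All names, thresholds, and the single toxicity proxy are assumptions for illustration (the paper combines toxicity mitigation with a faithfulness objective), and a greedy step stands in for re-scoring the beam hypotheses; this is not the authors' implementation.

```python
# Toy-scale sketch of the idea above (hypothetical names and losses, not the
# authors' code): detect added toxicity at a decoding step, adjust the
# key/value self-attention activations with a gradient step, and redo the step.
import torch

torch.manual_seed(0)

VOCAB, DIM = 100, 16
TOXIC_IDS = [13, 42]                 # assumed ids of toxic tokens (illustrative)
out_proj = torch.nn.Linear(DIM, VOCAB)

def decode_step(query, keys, values):
    """One toy self-attention + output-projection step over the key/value cache."""
    attn = torch.softmax(query @ keys.T / DIM ** 0.5, dim=-1)
    return out_proj(attn @ values)

def toxicity(logits):
    """Proxy toxicity score: probability mass on the assumed toxic token ids."""
    return torch.softmax(logits, dim=-1)[..., TOXIC_IDS].sum()

# Toy decoding state: one query vector plus a cached key/value pair per position.
query = torch.randn(1, DIM)
keys = torch.randn(5, DIM, requires_grad=True)
values = torch.randn(5, DIM, requires_grad=True)

logits = decode_step(query, keys, values)
tox = toxicity(logits)
if tox > 0.05:                                   # added toxicity detected
    tox.backward()                               # gradient of the toxicity proxy
    with torch.no_grad():                        # adjust key/value activations
        keys -= 0.1 * keys.grad
        values -= 0.1 * values.grad
    logits = decode_step(query, keys, values)    # redo the search step

next_token = int(logits.argmax(dim=-1))          # re-evaluated (greedy) hypothesis
print(next_token)
```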