Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Jun 20, 2024

Rochelle Choenni, Sara Rajaee, Christof Monz, Ekaterina Shutova

Figure 1 for On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Figure 2 for On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Figure 3 for On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Figure 4 for On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Share this with someone who'll enjoy it:

Abstract:While multilingual language models (MLMs) have been trained on 100+ languages, they are typically only evaluated across a handful of them due to a lack of available test data in most languages. This is particularly problematic when assessing MLM's potential for low-resource and unseen languages. In this paper, we present an analysis of existing evaluation frameworks in multilingual NLP, discuss their limitations, and propose several directions for more robust and reliable evaluation practices. Furthermore, we empirically study to what extent machine translation offers a {reliable alternative to human translation} for large-scale evaluation of MLMs across a wide set of languages. We use a SOTA translation model to translate test data from 4 tasks to 198 languages and use them to evaluate three MLMs. We show that while the selected subsets of high-resource test languages are generally sufficiently representative of a wider range of high-resource languages, we tend to overestimate MLMs' ability on low-resource languages. Finally, we show that simpler baselines can achieve relatively strong performance without having benefited from large-scale multilingual pretraining.

View paper on

Share this with someone who'll enjoy it:

Title:On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Paper and Code