Alessandro Raganato

SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes

Apr 16, 2025

Raúl Vázquez, Timothee Mickus, Elaine Zosa, Teemu Vahtola, Jörg Tiedemann, Aman Sinha, Vincent Segonne, Fernando Sánchez-Vega, Alessandro Raganato, Jindřich Libovický(+8 more)

Abstract: We present the Mu-SHROOM shared task, which focuses on detecting hallucinations and other overgeneration mistakes in the output of instruction-tuned large language models (LLMs). Mu-SHROOM addresses general-purpose LLMs in 14 languages, and frames the hallucination detection problem as a span-labeling task. We received 2,618 submissions from 43 participating teams employing diverse methodologies. The large number of submissions underscores the interest of the community in hallucination detection. We present the results of the participating systems and conduct an empirical analysis to identify key factors contributing to strong performance in this task. We also emphasize relevant current challenges, notably the varying degree of hallucinations across languages and the high annotator disagreement when labeling hallucination spans.

* Mu-SHROOM is part of SemEval-2025 (Task 3). To be published in: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
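
To make the span-labeling framing concrete, here is a minimal sketch of how predicted hallucination spans could be compared with annotated ones. The representation (character offsets with an exclusive end), the example sentence, and the use of character-level intersection-over-union are assumptions made for illustration; the abstract does not specify the task's data format or its official metrics.

```python
def span_chars(spans):
    """Expand (start, end) character spans, end exclusive, into a set of indices."""
    return {i for start, end in spans for i in range(start, end)}


def char_iou(predicted, gold):
    """Character-level intersection-over-union between two lists of spans."""
    p, g = span_chars(predicted), span_chars(gold)
    if not p and not g:
        # Nothing was flagged and nothing should have been: count as full agreement.
        return 1.0
    return len(p & g) / len(p | g)


# Hypothetical model output in which the year is the hallucinated part.
text = "The Eiffel Tower was completed in 1925 in Paris."
gold = [(34, 38)]   # annotators marked "1925"
pred = [(31, 38)]   # a system marked "in 1925"

score = char_iou(pred, gold)
```

Scoring at the character level rewards partial overlap, which matters when annotators themselves disagree on exact span boundaries.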

How to Blend Concepts in Diffusion Models

Jul 19, 2024

Giorgio Longari, Lorenzo Olearo, Simone Melzi, Rafael Peñaloza, Alessandro Raganato

Abstract: For the last decade, there has been a push to use multi-dimensional (latent) spaces to represent concepts; and yet how to manipulate these concepts or reason with them remains largely unclear. Some recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. To that end, we explore the task of concept blending through diffusion models. Diffusion models are based on a connection between a latent representation of textual prompts and a latent space that enables image reconstruction and generation. This task allows us to try different text-based combination strategies and to evaluate them easily through visual analysis. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.

SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes

Mar 20, 2024

Timothee Mickus, Elaine Zosa, Raúl Vázquez, Teemu Vahtola, Jörg Tiedemann, Vincent Segonne, Alessandro Raganato, Marianna Apidianaki

Abstract: This paper presents the results of SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate. Such cases of overgeneration jeopardize many NLG applications, where correctness is often mission-critical. The shared task was conducted with a newly constructed dataset of 4000 model outputs labeled by 5 annotators each, spanning 3 NLP tasks: machine translation, paraphrase generation and definition modeling. The shared task was tackled by a total of 58 different users grouped in 42 teams, out of which 27 elected to write a system description paper; collectively, they submitted over 300 prediction sets on both tracks of the shared task. We observe a number of key trends in how this task was tackled -- many participants rely on a handful of models, and often rely either on synthetic data for fine-tuning or on zero-shot prompting strategies. While a majority of the teams did outperform our proposed baseline system, the performances of top-scoring systems are still consistent with a random handling of the more challenging items.

* SemEval 2024 shared task. Pre-review version

MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki

Mar 12, 2024

Timothee Mickus, Stig-Arne Grönroos, Joseph Attieh, Michele Boggia, Ona De Gibert, Shaoxiong Ji, Niki Andreas Lopi, Alessandro Raganato, Raúl Vázquez, Jörg Tiedemann

Abstract: NLP in the age of monolithic large language models is approaching its limits in terms of size and information that can be handled. The trend is toward modularization, a necessary step in the direction of designing smaller sub-networks and components with specialized functionality. In this paper, we present the MAMMOTH toolkit: a framework designed for training massively multilingual modular machine translation systems at scale, initially derived from OpenNMT-py and then adapted to ensure efficient training across computation clusters. We showcase its efficiency across clusters of A100 and V100 NVIDIA GPUs, and discuss our design philosophy and plans for future information. The toolkit is publicly available online.

* Presented as a demo at EACL 2024

Democratizing Machine Translation with OPUS-MT

Dec 04, 2022

Jörg Tiedemann, Mikko Aulamo, Daria Bakshandaeva, Michele Boggia, Stig-Arne Grönroos, Tommi Nieminen, Alessandro Raganato, Yves Scherrer, Raul Vazquez, Sami Virpioja

Abstract: This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows. We discuss our on-going mission of increasing language coverage and translation quality, and also describe on-going work on the development of modular translation models and speed-optimized compact solutions for real-time translation on regular desktops and small devices.

XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization

Oct 13, 2020

Alessandro Raganato, Tommaso Pasini, Jose Camacho-Collados, Mohammad Taher Pilehvar

Abstract: The ability to correctly model distinct meanings of a word is crucial for the effectiveness of semantic representation techniques. However, most existing evaluation benchmarks for assessing this criterion are tied to sense inventories (usually WordNet), restricting their usage to a small subset of knowledge-based representation techniques. The Word-in-Context dataset (WiC) addresses the dependence on sense inventories by reformulating the standard disambiguation task as a binary classification problem, but it is limited to the English language. We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages from varied language families and with different degrees of resource availability, opening room for evaluation scenarios such as zero-shot cross-lingual transfer. We perform a series of experiments to determine the reliability of the datasets and to set performance baselines for several recent contextualized multilingual models. Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance in the task of distinguishing different meanings of a word, even for distant languages. XL-WiC is available at https://pilehvar.github.io/xlwic/.

* EMNLP2020

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation

Feb 24, 2020

Alessandro Raganato, Yves Scherrer, Jörg Tiedemann

Abstract: Transformer-based models have brought a radical change to neural machine translation. A key feature of the Transformer architecture is the so-called multi-head attention mechanism, which allows the model to focus simultaneously on different parts of the input. However, recent works have shown that attention heads learn simple positional patterns which are often redundant. In this paper, we propose to replace all but one attention head of each encoder layer with fixed -- non-learnable -- attentive patterns that are solely based on position and do not require any external knowledge. Our experiments show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality and even increases BLEU scores by up to 3 points in low-resource scenarios.
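
As a rough illustration of what a fixed, position-only attention pattern can look like, the sketch below builds an attention matrix in which every token attends to a single neighbour at a constant offset. The one-hot offset pattern and the helper names `fixed_attention` and `attend` are assumptions chosen for illustration; the abstract does not describe the exact patterns used in the paper.

```python
def fixed_attention(length, offset):
    """Build a non-learnable attention matrix.

    Row i puts all of its weight on position i + offset, clamped to the
    sequence boundaries, so every row still sums to 1.
    """
    weights = [[0.0] * length for _ in range(length)]
    for i in range(length):
        j = min(max(i + offset, 0), length - 1)
        weights[i][j] = 1.0
    return weights


def attend(weights, values):
    """Apply attention weights to a list of value vectors (plain matrix product)."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


# A head that always looks at the previous token (offset -1).
previous_token = fixed_attention(3, -1)
outputs = attend(previous_token, [[1.0], [2.0], [3.0]])
```

Because such a matrix depends only on sequence length and offset, it needs no query or key projections and has no parameters to train.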

The University of Helsinki submissions to the WMT19 news translation task

Jun 10, 2019

Aarne Talman, Umut Sulubacak, Raúl Vázquez, Yves Scherrer, Sami Virpioja, Alessandro Raganato, Arvi Hurskainen, Jörg Tiedemann

Abstract: In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.

* To appear in WMT19

Multilingual NMT with a language-independent attention bridge

Nov 01, 2018

Raúl Vázquez, Alessandro Raganato, Jörg Tiedemann, Mathias Creutz

Abstract: In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate attention bridge that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its capacity for abstraction and transfer learning.

A Large-Scale Multilingual Disambiguation of Glosses

Aug 24, 2016

José Camacho Collados, Claudio Delli Bovi, Alessandro Raganato, Roberto Navigli

Abstract: Linking concepts and named entities to knowledge bases has become a crucial Natural Language Understanding task. In this respect, recent works have shown the key advantage of exploiting textual definitions in various Natural Language Processing applications. However, to date there are no reliable large-scale corpora of sense-annotated textual definitions available to the research community. In this paper we present a large-scale high-quality corpus of disambiguated glosses in multiple languages, comprising sense annotations of both concepts and named entities from a unified sense inventory. Our approach for the construction and disambiguation of the corpus builds upon the structure of a large multilingual semantic network and a state-of-the-art disambiguation system; first, we gather complementary information of equivalent definitions across different languages to provide context for disambiguation, and then we combine it with a semantic similarity-based refinement. As a result we obtain a multilingual corpus of textual definitions featuring over 38 million definitions in 263 languages, and we make it freely available at http://lcl.uniroma1.it/disambiguated-glosses. Experiments on Open Information Extraction and Sense Clustering show how two state-of-the-art approaches improve their performance by integrating our disambiguated corpus into their pipeline.

* Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC), 2016, pages 1701-1708, Portoroz, Slovenia
* Accepted in LREC 2016
