Michele Resta

Evalita-LLM: Benchmarking Large Language Models on Italian

Feb 04, 2025

Bernardo Magnini, Roberto Zanoli, Michele Resta, Martin Cimmino, Paolo Albano, Marco Madeddu, Viviana Patti

Abstract: We describe Evalita-LLM, a new benchmark designed to evaluate Large Language Models (LLMs) on Italian tasks. The distinguishing and innovative features of Evalita-LLM are the following: (i) all tasks are native Italian, avoiding issues of translating from Italian and potential cultural biases; (ii) in addition to well established multiple-choice tasks, the benchmark includes generative tasks, enabling more natural interaction with LLMs; (iii) all tasks are evaluated against multiple prompts, this way mitigating the model sensitivity to specific prompts and allowing a fairer and objective evaluation. We propose an iterative methodology, where candidate tasks and candidate prompts are validated against a set of LLMs used for development. We report experimental results from the benchmark's development phase, and provide performance statistics for several state-of-the-art LLMs.

* 42 pages, 1 figure, 32 tables

Self-generated Replay Memories for Continual Neural Machine Translation

Mar 19, 2024

Michele Resta, Davide Bacciu

Abstract: Modern Neural Machine Translation systems exhibit strong performance in several different languages and are constantly improving. Their ability to learn continuously is, however, still severely limited by the catastrophic forgetting issue. In this work, we leverage a key property of encoder-decoder Transformers, i.e. their generative ability, to propose a novel approach to continually learning Neural Machine Translation systems. We show how this can effectively learn on a stream of experiences comprising different languages, by leveraging a replay memory populated by using the model itself as a generator of parallel sentences. We empirically demonstrate that our approach can counteract catastrophic forgetting without requiring explicit memorization of training data. Code will be publicly available upon publication. Code: https://github.com/m-resta/sg-rep

* Accepted at NAACL 2024
