Lorenzo De Mattei

On the interaction of automatic evaluation and task framing in headline style transfer

Jan 05, 2021

Lorenzo De Mattei, Michele Cafagna, Huiyuan Lai, Felice Dell'Orletta, Malvina Nissim, Albert Gatt

Abstract: An ongoing debate in the NLG community concerns the best way to evaluate systems, with human evaluation often being considered the most reliable method, compared to corpus-based metrics. However, tasks involving subtle textual differences, such as style transfer, tend to be hard for humans to perform. In this paper, we propose an evaluation method for this task based on purposely-trained classifiers, showing that it better reflects system differences than traditional metrics such as BLEU and ROUGE.

GePpeTto Carves Italian into a Language Model

Apr 29, 2020

Lorenzo De Mattei, Michele Cafagna, Felice Dell'Orletta, Malvina Nissim, Marco Guerini

Abstract: In the last few years, pre-trained neural architectures have provided impressive improvements across several NLP tasks. Still, generative language models are available mainly for English. We develop GePpeTto, the first generative language model for Italian, built using the GPT-2 architecture. We provide a thorough analysis of GePpeTto's quality by means of both an automatic and a human-based evaluation. The automatic assessment consists of (i) calculating perplexity across different genres and (ii) a profiling analysis of GePpeTto's writing characteristics. We find that GePpeTto's production is a sort of bonsai version of human production, with shorter yet complex sentences. Human evaluation is performed over a sentence completion task, where GePpeTto's output is judged as natural more often than not, and much closer to the original human texts than to a simpler language model which we take as baseline.
