Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ines Reinig

ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

Mar 08, 2024

Sotaro Takeshita, Tommaso Green, Ines Reinig, Kai Eckert, Simone Paolo Ponzetto

Figure 1 for ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

Figure 2 for ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

Figure 3 for ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

Figure 4 for ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

Abstract:Extensive efforts in the past have been directed toward the development of summarization datasets. However, a predominant number of these resources have been (semi)-automatically generated, typically through web data crawling, resulting in subpar resources for training and evaluating summarization systems, a quality compromise that is arguably due to the substantial costs associated with generating ground-truth summaries, particularly for diverse languages and specialized domains. To address this issue, we present ACLSum, a novel summarization dataset carefully crafted and evaluated by domain experts. In contrast to previous datasets, ACLSum facilitates multi-aspect summarization of scientific papers, covering challenges, approaches, and outcomes in depth. Through extensive experiments, we evaluate the quality of our resource and the performance of models based on pretrained language models and state-of-the-art large language models (LLMs). Additionally, we explore the effectiveness of extractive versus abstractive summarization within the scholarly domain on the basis of automatically discovered aspects. Our results corroborate previous findings in the general domain and indicate the general superiority of end-to-end aspect-based summarization. Our data is released at https://github.com/sobamchan/aclsum.

Via

Access Paper or Ask Questions

Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Jun 07, 2023

Ines Reinig, Katja Markert

Figure 1 for Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Figure 2 for Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Figure 3 for Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Figure 4 for Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Abstract:Compared to English, German word order is freer and therefore poses additional challenges for natural language inference (NLI). We create WOGLI (Word Order in German Language Inference), the first adversarial NLI dataset for German word order that has the following properties: (i) each premise has an entailed and a non-entailed hypothesis; (ii) premise and hypotheses differ only in word order and necessary morphological changes to mark case and number. In particular, each premise andits two hypotheses contain exactly the same lemmata. Our adversarial examples require the model to use morphological markers in order to recognise or reject entailment. We show that current German autoencoding models fine-tuned on translated NLI data can struggle on this challenge set, reflecting the fact that translated NLI datasets will not mirror all necessary language phenomena in the target language. We also examine performance after data augmentation as well as on related word order phenomena derived from WOGLI. Our datasets are publically available at https://github.com/ireinig/wogli.

Via

Access Paper or Ask Questions