Anna Palatkina

KoWit-24: A Richly Annotated Dataset of Wordplay in News Headlines

Mar 03, 2025

Alexander Baranov, Anna Palatkina, Yulia Makovka, Pavel Braslavski

Abstract: We present KoWit-24, a dataset with fine-grained annotation of wordplay in 2,700 Russian news headlines. KoWit-24 annotations include the presence of wordplay, its type, wordplay anchors, and words/phrases the wordplay refers to. Unlike the majority of existing humor collections of canned jokes, KoWit-24 provides wordplay contexts -- each headline is accompanied by the news lead and summary. The most common type of wordplay in the dataset is the transformation of collocations, idioms, and named entities -- the mechanism that has been underrepresented in previous humor datasets. Our experiments with five LLMs show that there is ample room for improvement in wordplay detection and interpretation tasks. The dataset and evaluation scripts are available at https://github.com/Humor-Research/KoWit-24

NorBench -- A Benchmark for Norwegian Language Models

May 06, 2023

David Samuel, Andrey Kutuzov, Samia Touileb, Erik Velldal, Lilja Øvrelid, Egil Rønningstad, Elina Sigdel, Anna Palatkina

Abstract: We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics. We also introduce a range of new Norwegian language models (both encoder and encoder-decoder based). Finally, we compare and analyze their performance, along with other existing LMs, across the different benchmark tests of NorBench.

* Accepted to NoDaLiDa 2023
