Iris Hendrickx

BrightCookies at SemEval-2025 Task 9: Exploring Data Augmentation for Food Hazard Classification

Apr 29, 2025

Foteini Papadopoulou, Osman Mutlu, Neris Özen, Bas H. M. van der Velden, Iris Hendrickx, Ali Hürriyetoğlu

Abstract: This paper presents our system developed for the SemEval-2025 Task 9: The Food Hazard Detection Challenge. The shared task's objective is to evaluate explainable classification systems for classifying hazards and products in two levels of granularity from food recall incident reports. In this work, we propose text augmentation techniques as a way to improve poor performance on minority classes and compare their effect for each category on various transformer and machine learning models. We explore three word-level data augmentation techniques, namely synonym replacement, random word swapping, and contextual word insertion. The results show that transformer models tend to have a better overall performance. None of the three augmentation techniques consistently improved overall performance for classifying hazards and products. With the BERT model, we observed a statistically significant improvement (P < 0.05) in the fine-grained categories when comparing the baseline with each augmented model. Compared to the baseline, the contextual word insertion augmentation improved the accuracy of predictions for the minority hazard classes by 6%. This suggests that targeted augmentation of minority classes can improve the performance of transformer models.
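
To make two of the word-level techniques named in the abstract concrete, here is a minimal Python sketch of synonym replacement and random word swapping, applied only to minority-class examples. It is not the authors' implementation: the `SYNONYMS` table, the function names, and the `augment_minority` helper are assumptions made for illustration, and contextual word insertion is left out because it would need a masked language model.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration);
# a real system would draw synonyms from a lexical resource.
SYNONYMS = {
    "contaminated": ["tainted", "polluted"],
    "recall": ["withdrawal"],
}

def synonym_replacement(tokens, rng, n=1):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if tok.lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i].lower()])
    return out

def random_swap(tokens, rng, n=1):
    """Swap n randomly chosen pairs of token positions."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def augment_minority(examples, minority_labels, rng):
    """Append one augmented copy of every example labeled with a minority class."""
    augmented = list(examples)
    for text, label in examples:
        if label in minority_labels:
            tokens = text.split()
            new_tokens = random_swap(synonym_replacement(tokens, rng), rng)
            augmented.append((" ".join(new_tokens), label))
    return augmented
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the augmentation reproducible, which matters when comparing a baseline against augmented variants.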

SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals

Nov 23, 2019

Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano, Stan Szpakowicz

Abstract: In response to the continuing research interest in computational semantic analysis, we have proposed a new task for SemEval-2010: multi-way classification of mutually exclusive semantic relations between pairs of nominals. The task is designed to compare different approaches to the problem and to provide a standard testbed for future research. In this paper, we define the task, describe the creation of the datasets, and discuss the results of the 28 participating systems submitted by 10 teams.

* SemEval-2010
* semantic relations, nominals

SemEval-2013 Task 4: Free Paraphrases of Noun Compounds

Nov 23, 2019

Iris Hendrickx, Preslav Nakov, Stan Szpakowicz, Zornitsa Kozareva, Diarmuid Ó Séaghdha, Tony Veale

Abstract: In this paper, we describe SemEval-2013 Task 4: the definition, the data, the evaluation and the results. The task is to capture some of the meaning of English noun compounds via paraphrasing. Given a two-word noun compound, the participating system is asked to produce an explicitly ranked list of its free-form paraphrases. The list is automatically compared and evaluated against a similarly ranked list of paraphrases proposed by human annotators, recruited and managed through Amazon's Mechanical Turk. The comparison of raw paraphrases is sensitive to syntactic and morphological variation. The "gold" ranking is based on the relative popularity of paraphrases among annotators. To make the ranking more reliable, highly similar paraphrases are grouped, so as to downplay superficial differences in syntax and morphology. Three systems participated in the task. They all beat a simple baseline on one of the two evaluation measures, but not on both measures. This shows that the task is difficult.

* SemEval-2013
* noun compounds, paraphrasing verbs, semantic interpretation, multi-word expressions, MWEs
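
The popularity-based "gold" ranking described in the SemEval-2013 Task 4 abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the task's scoring code: the `normalize` function is a crude stand-in (an assumption) for the grouping of highly similar paraphrases, and the function names are hypothetical.

```python
from collections import Counter

def normalize(paraphrase):
    """Crude normalization (an assumption): lowercase and collapse whitespace."""
    return " ".join(paraphrase.lower().split())

def gold_ranking(annotator_paraphrases):
    """Rank paraphrases by how many annotators proposed them, most popular first."""
    counts = Counter()
    for proposals in annotator_paraphrases:
        # Count each normalized paraphrase at most once per annotator.
        counts.update({normalize(p) for p in proposals})
    # Sort by descending count, then alphabetically for a deterministic order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Each inner list passed to `gold_ranking` holds the paraphrases proposed by one annotator for a single noun compound; a more faithful version would replace `normalize` with grouping that tolerates syntactic and morphological variation.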

Unraveling reported dreams with text analytics

Dec 12, 2016

Iris Hendrickx, Louis Onrust, Florian Kunneman, Ali Hürriyetoğlu, Antal van den Bosch, Wessel Stoop

Abstract: We investigate what distinguishes reported dreams from other personal narratives. The continuity hypothesis, stemming from psychological dream analysis work, states that most dreams refer to a person's daily life and personal concerns, similar to other personal narratives such as diary entries. Differences between the two texts may reveal the linguistic markers of dream text, which could be the basis for new dream analysis work and for the automatic detection of dream descriptions. We used three text analytics methods: text classification, topic modeling, and text coherence analysis, and applied these methods to a balanced set of texts representing dreams, diary entries, and other personal stories. We observed that dream texts could be distinguished from other personal narratives nearly perfectly, mostly based on the presence of uncertainty markers and descriptions of scenes. Important markers for non-dream narratives are specific time expressions and conversational expressions. Dream texts also exhibit a lower discourse coherence than other personal narratives.
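
As a rough illustration of how lexical markers of the kind mentioned in the abstract could be turned into classifier features, the Python sketch below computes per-token rates of two marker types. The word lists are hypothetical placeholders chosen for the example; they are not the markers reported in the paper, and the paper's actual feature set is not given here.

```python
# Hypothetical marker lists (assumptions for illustration only).
UNCERTAINTY_MARKERS = {"somehow", "maybe", "perhaps", "seemed"}
TIME_EXPRESSIONS = {"today", "yesterday", "tomorrow", "monday"}

def marker_rates(text):
    """Return the fraction of tokens that fall in each hypothetical marker list."""
    # Whitespace tokenization with surrounding punctuation stripped.
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {
        "uncertainty": sum(t in UNCERTAINTY_MARKERS for t in tokens) / total,
        "time": sum(t in TIME_EXPRESSIONS for t in tokens) / total,
    }
```

Rates like these could be fed to any standard text classifier alongside bag-of-words features; normalizing by text length keeps long and short narratives comparable.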
