Tiberiu Sosea

CalibrateMix: Guided-Mixup Calibration of Image Semi-Supervised Models

Nov 17, 2025

Mehrab Mustafy Rahman, Jayanth Mohan, Tiberiu Sosea, Cornelia Caragea

Abstract: Semi-supervised learning (SSL) has demonstrated high performance in image classification tasks by effectively utilizing both labeled and unlabeled data. However, existing SSL methods often suffer from poor calibration, with models yielding overconfident predictions that misrepresent actual prediction likelihoods. Recently, neural networks trained with mixup, which linearly interpolates random examples from the training set, have shown better calibration in supervised settings. However, calibration of neural models remains under-explored in semi-supervised settings. Although effective in supervised model calibration, random mixup of pseudolabels in SSL presents challenges due to the overconfidence and unreliability of pseudolabels. In this work, we introduce CalibrateMix, a targeted mixup-based approach that aims to improve the calibration of SSL models while maintaining or even improving their classification accuracy. Our method leverages training dynamics of labeled and unlabeled samples to identify "easy-to-learn" and "hard-to-learn" samples, which in turn are utilized in a targeted mixup of easy and hard samples. Experimental results across several benchmark image datasets show that our method achieves lower expected calibration error (ECE) and superior accuracy compared to existing SSL approaches.
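
As a rough illustration of the two building blocks this abstract names, the sketch below shows plain mixup (a convex combination of two examples and their label vectors) and a binned estimate of expected calibration error. It is not the authors' implementation: the function names `mixup` and `expected_calibration_error`, the equal-width binning with 10 bins, and the fixed mixing weight `lam` are assumptions made for the example, and the abstract does not specify how easy and hard samples are selected or paired.

```python
from typing import List, Sequence, Tuple


def mixup(
    x_a: Sequence[float],
    x_b: Sequence[float],
    y_a: Sequence[float],
    y_b: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Convex combination of two examples and their (soft) label vectors.

    `lam` is the mixing weight in [0, 1]; it is passed in explicitly here
    (an assumption for the example) rather than sampled.
    """
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: weighted average of |mean confidence - accuracy| per bin.

    Uses equal-width bins over [0, 1]; the bin count is an assumed default.
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

For example, four predictions made at 0.9 confidence of which three are correct give an ECE of about 0.15, the gap between stated confidence and observed accuracy.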

MarginMatch: Improving Semi-Supervised Learning with Pseudo-Margins

Aug 17, 2023

Tiberiu Sosea, Cornelia Caragea

Abstract: We introduce MarginMatch, a new SSL approach combining consistency regularization and pseudo-labeling, with its main novelty arising from the use of unlabeled data training dynamics to measure pseudo-label quality. Instead of using only the model's confidence on an unlabeled example at an arbitrary iteration to decide if the example should be masked or not, MarginMatch also analyzes the behavior of the model on the pseudo-labeled examples as the training progresses, to ensure low-quality predictions are masked out. MarginMatch brings substantial improvements on four vision benchmarks in low data regimes and on two large-scale datasets, emphasizing the importance of enforcing high-quality pseudo-labels. Notably, we obtain an improvement in error rate over the state-of-the-art of 3.25% on CIFAR-100 with only 25 labels per class and of 3.78% on STL-10 using as few as 4 labels per class. We make our code available at https://github.com/tsosea2/MarginMatch.
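
The masking idea in this abstract, judging a pseudo-label by the model's behavior over training rather than by its confidence at a single iteration, can be sketched as follows. This is one plausible reading, not the released MarginMatch code: the margin definition (logit of the pseudo-label minus the largest other logit), the exponential moving average, the `momentum` and `threshold` values, and the `MarginTracker` class are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Sequence


def margin(logits: Sequence[float], label: int) -> float:
    """Logit of `label` minus the largest logit among the other classes."""
    others = [z for i, z in enumerate(logits) if i != label]
    return logits[label] - max(others)


class MarginTracker:
    """Keeps a running average of each unlabeled example's margin.

    Hypothetical helper: the averaging scheme and default values are
    assumptions, not taken from the paper.
    """

    def __init__(self, momentum: float = 0.9) -> None:
        self.momentum = momentum
        self.avg: Dict[Hashable, float] = {}

    def update(self, example_id: Hashable, logits: Sequence[float],
               pseudo_label: int) -> float:
        """Fold the current iteration's margin into the running average."""
        m = margin(logits, pseudo_label)
        prev = self.avg.get(example_id)
        if prev is None:
            self.avg[example_id] = m
        else:
            self.avg[example_id] = self.momentum * prev + (1.0 - self.momentum) * m
        return self.avg[example_id]

    def keep(self, example_id: Hashable, threshold: float = 0.0) -> bool:
        """True if the example's averaged margin exceeds the threshold.

        Examples that were never updated are masked out.
        """
        return self.avg.get(example_id, float("-inf")) > threshold


# Usage: an example whose pseudo-label the model later contradicts sees
# its averaged margin shrink, and is masked once it drops below threshold.
tracker = MarginTracker(momentum=0.5)
tracker.update("u1", [2.0, 0.5, 0.1], pseudo_label=0)
tracker.update("u1", [0.2, 1.2, 0.1], pseudo_label=0)
print(tracker.keep("u1"))
```

Averaging over iterations means a single confident but unstable prediction does not by itself admit an example into the unlabeled loss.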

Sarcasm Detection in a Disaster Context

Aug 16, 2023

Tiberiu Sosea, Junyi Jessy Li, Cornelia Caragea

Abstract: During natural disasters, people often use social media platforms such as Twitter to ask for help, to provide information about the disaster situation, or to express contempt about the unfolding event or public policies and guidelines. This contempt is in some cases expressed as sarcasm or irony. Understanding this form of speech in a disaster-centric context is essential to improving natural language understanding of disaster-related tweets. In this paper, we introduce HurricaneSARC, a dataset of 15,000 tweets annotated for intended sarcasm, and provide a comprehensive investigation of sarcasm detection using pre-trained language models. Our best model obtains as much as 0.70 F1 on our dataset. We also demonstrate that the performance on HurricaneSARC can be improved by leveraging intermediate task transfer learning. We release our data and code at https://github.com/tsosea2/HurricaneSarc.

Unsupervised Extractive Summarization of Emotion Triggers

Jun 02, 2023

Tiberiu Sosea, Hongli Zhan, Junyi Jessy Li, Cornelia Caragea

Abstract: Understanding what leads to emotions during large-scale crises is important as it can provide groundings for expressed emotions and subsequently improve the understanding of ongoing disasters. Recent approaches trained supervised models to both detect emotions and explain emotion triggers (events and appraisals) via abstractive summarization. However, obtaining timely, high-quality abstractive summaries is expensive and extremely time-consuming, requiring highly trained expert annotators. In time-sensitive, high-stakes contexts, this can block necessary responses. We instead pursue unsupervised systems that extract triggers from text. First, we introduce CovidET-EXT, augmenting the abstractive dataset of Zhan et al. (2022), built in the context of the COVID-19 crisis, with extractive triggers. Second, we develop new unsupervised learning models that can jointly detect emotions and summarize their triggers. Our best approach, entitled Emotion-Aware Pagerank, incorporates emotion information from external sources combined with a language understanding module, and outperforms strong baselines. We release our data and code at https://github.com/tsosea2/CovidET-EXT.

* ACL 2023 Camera-Ready
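
The abstract above does not spell out how Emotion-Aware Pagerank works, so the following is only a generic sketch of the underlying idea of biasing PageRank toward emotion-bearing text units. It runs personalized PageRank over a small similarity graph with a teleport distribution proportional to hypothetical emotion scores; the function name `personalized_pagerank`, the `emotion_scores` input, the damping factor of 0.85, and the iteration count are assumptions, not details from the paper.

```python
from typing import List, Sequence


def personalized_pagerank(
    adjacency: Sequence[Sequence[float]],
    emotion_scores: Sequence[float],
    damping: float = 0.85,
    iterations: int = 50,
) -> List[float]:
    """Power-iteration PageRank with a biased teleport distribution.

    adjacency[i][j] is a non-negative edge weight from unit i to unit j
    (e.g. sentence similarity). emotion_scores[i] is a hypothetical
    non-negative score, e.g. from an emotion lexicon, that steers the
    random walk's restarts toward emotion-bearing units.
    """
    n = len(adjacency)
    total = sum(emotion_scores)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty graph and a positive total score")
    teleport = [s / total for s in emotion_scores]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Restart mass, distributed according to the emotion bias.
        new_rank = [(1.0 - damping) * t for t in teleport]
        for i, row in enumerate(adjacency):
            out_weight = sum(row)
            if out_weight == 0:
                # Dangling node: spread its mass via the teleport vector.
                for j in range(n):
                    new_rank[j] += damping * rank[i] * teleport[j]
            else:
                for j, w in enumerate(row):
                    new_rank[j] += damping * rank[i] * w / out_weight
        rank = new_rank
    return rank


# Usage: three equally connected sentences; the first carries most of
# the (hypothetical) emotion score and therefore ranks highest.
graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
ranks = personalized_pagerank(graph, emotion_scores=[3.0, 1.0, 1.0])
best = max(range(len(ranks)), key=ranks.__getitem__)
print(best, ranks)
```

An extractive summary under this sketch would simply take the top-ranked units; how the actual method combines lexicon signals with its language understanding module is not described in the abstract.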

Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts

Oct 22, 2022

Hongli Zhan, Tiberiu Sosea, Cornelia Caragea, Junyi Jessy Li

Abstract: Crises such as the COVID-19 pandemic continuously threaten our world and emotionally affect billions of people worldwide in distinct ways. Understanding the triggers leading to people's emotions is of crucial importance. Social media posts can be a good source for such analysis, yet these texts tend to be charged with multiple emotions, with triggers scattered across multiple sentences. This paper takes a novel angle, namely, emotion detection and trigger summarization, aiming to both detect perceived emotions in text and summarize the events and appraisals that trigger each emotion. To support this goal, we introduce CovidET (Emotions and their Triggers during Covid-19), a dataset of ~1,900 English Reddit posts related to COVID-19, which contains manual annotations of perceived emotions and abstractive summaries of their triggers described in the post. We develop strong baselines to jointly detect emotions and summarize emotion triggers. Our analyses show that CovidET presents new challenges in emotion-specific summarization, as well as multi-emotion detection in long social media posts.

* EMNLP 2022 Camera Ready Version
