Varsha Suresh

Synthetic Data Augmentation for Cross-domain Implicit Discourse Relation Recognition

Mar 26, 2025

Frances Yung, Varsha Suresh, Zaynab Reza, Mansoor Ahmad, Vera Demberg

Abstract: Implicit discourse relation recognition (IDRR) -- the task of identifying the implicit coherence relation between two text spans -- requires deep semantic understanding. Recent studies have shown that zero- or few-shot approaches significantly lag behind supervised models, but LLMs may be useful for synthetic data augmentation, where LLMs generate a second argument following a specified coherence relation. We applied this approach in a cross-domain setting, generating discourse continuations using unlabelled target-domain data to adapt a base model which was trained on source-domain labelled data. Evaluations conducted on a large-scale test set revealed that different variations of the approach did not result in any significant improvements. We conclude that LLMs often fail to generate useful samples for IDRR, and emphasize the importance of considering both statistical significance and comparability when evaluating IDRR models.

Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues

Mar 05, 2025

Varsha Suresh, M. Hamza Mughal, Christian Theobalt, Vera Demberg

Abstract: Research in linguistics shows that non-verbal cues, such as gestures, play a crucial role in spoken discourse. For example, speakers perform hand gestures to indicate topic shifts, helping listeners identify transitions in discourse. In this work, we investigate whether the joint modeling of gestures using human motion sequences and language can improve spoken discourse modeling in language models. To integrate gestures into language models, we first encode 3D human motion sequences into discrete gesture tokens using a VQ-VAE. These gesture token embeddings are then aligned with text embeddings through feature alignment, mapping them into the text embedding space. To evaluate the gesture-aligned language model on spoken discourse, we construct text infilling tasks targeting three key discourse cues grounded in linguistic research: discourse connectives, stance markers, and quantifiers. Results show that incorporating gestures enhances marker prediction accuracy across the three tasks, highlighting the complementary information that gestures can offer in modeling spoken discourse. We view this work as an initial step toward leveraging non-verbal cues to advance spoken language modeling in language models.
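
The tokenization step mentioned in the abstract can be illustrated with a minimal sketch of the quantization stage of a VQ-VAE: each continuous encoder output is replaced by the index of its nearest codebook vector. This is not the authors' implementation. The function names, the codebook values, and the toy encoder outputs below are all hypothetical, and a real model would learn both the encoder and the codebook.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each continuous vector to the index of its nearest codebook entry."""
    tokens = []
    for frame in frames:
        distances = [squared_distance(frame, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


# Hypothetical codebook with three 2-D entries (a learned codebook in practice).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]

# Hypothetical encoder outputs for four motion frames.
encoded_frames = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.4], [0.2, 0.1]]

gesture_tokens = quantize(encoded_frames, codebook)
print(gesture_tokens)
```

The resulting integer indices play the role of discrete gesture tokens; each index can then be looked up in an embedding table, which is the point at which an alignment with text embeddings could take place.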

An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks

Jun 20, 2024

Varsha Suresh, Salah Aït-Mokhtar, Caroline Brun, Ioan Calapodescu

Abstract: Self-supervised learning models have revolutionized the field of speech processing. However, the process of fine-tuning these models on downstream tasks requires substantial computational resources, particularly when dealing with multiple speech-processing tasks. In this paper, we explore the potential of adapter-based fine-tuning in developing a unified model capable of effectively handling multiple spoken language processing tasks. The tasks we investigate are Automatic Speech Recognition, Phoneme Recognition, Intent Classification, Slot Filling, and Spoken Emotion Recognition. We validate our approach through a series of experiments on the SUPERB benchmark, and our results indicate that adapter-based fine-tuning enables a single encoder-decoder model to perform multiple speech processing tasks with an average improvement of 18.4% across the five target tasks while staying efficient in terms of parameter updates.

* ICASSP 2024
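
The abstract does not say which adapter design was used. As a rough illustration of why adapter-based fine-tuning is cheap in parameter updates, here is a sketch of a commonly used bottleneck adapter: a down-projection, a nonlinearity, an up-projection, and a residual connection. All names, sizes, and weights are hypothetical, and biases are omitted for brevity.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(weights: Matrix, vector: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(vector: Vector) -> Vector:
    return [max(0.0, x) for x in vector]


def adapter_forward(hidden: Vector, w_down: Matrix, w_up: Matrix) -> Vector:
    """Bottleneck adapter: hidden + W_up * relu(W_down * hidden)."""
    bottleneck = relu(matvec(w_down, hidden))
    delta = matvec(w_up, bottleneck)
    return [h + d for h, d in zip(hidden, delta)]


def adapter_param_count(hidden_size: int, bottleneck_size: int) -> int:
    """Trainable weights in one bias-free adapter (down plus up projection)."""
    return 2 * hidden_size * bottleneck_size


# Hypothetical 4-dimensional hidden state and a 2-dimensional bottleneck.
hidden_state = [1.0, -2.0, 0.5, 3.0]
w_down = [[0.1, 0.0, 0.2, 0.0], [0.0, 0.3, 0.0, 0.1]]
# A zero up-projection makes the adapter start out as the identity function.
w_up_zero = [[0.0, 0.0] for _ in range(4)]

output = adapter_forward(hidden_state, w_down, w_up_zero)
print(output)
print(adapter_param_count(768, 64))
```

In a setup like this, only the adapter weights would be updated for each task while the shared backbone stays frozen, which is one way a single model can serve several tasks at low training cost.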

Using Positive Matching Contrastive Loss with Facial Action Units to mitigate bias in Facial Expression Recognition

Mar 08, 2023

Varsha Suresh, Desmond C. Ong

Abstract: Machine learning models automatically learn discriminative features from the data, and are therefore susceptible to learning strongly correlated biases, such as using protected attributes like gender and race. Most existing bias mitigation approaches aim to explicitly reduce the model's focus on these protected features. In this work, we propose to mitigate bias by explicitly guiding the model's focus towards task-relevant features using domain knowledge, and we hypothesize that this can indirectly reduce the dependence of the model on spurious correlations it learns from the data. We explore bias mitigation in facial expression recognition systems using facial Action Units (AUs) as the task-relevant feature. To this end, we introduce Feature-based Positive Matching Contrastive Loss, which learns the distances between the positives of a sample based on the similarity between their corresponding AU embeddings. We compare our approach with representative baselines and show that incorporating task-relevant features via our method can improve model fairness at minimal cost to classification performance.

* 10th International Conference on Affective Computing and Intelligent Interaction (ACII), 2022

Not All Negatives are Equal: Label-Aware Contrastive Loss for Fine-grained Text Classification

Sep 12, 2021

Varsha Suresh, Desmond C. Ong

Abstract: Fine-grained classification involves dealing with datasets with a larger number of classes and subtle differences between them. Guiding the model to focus on the dimensions that differentiate these commonly confusable classes is key to improving performance on fine-grained tasks. In this work, we analyse the contrastive fine-tuning of pre-trained language models on two fine-grained text classification tasks, emotion classification and sentiment analysis. We adaptively embed class relationships into a contrastive objective function to weigh the positives and negatives differently, in particular weighting closely confusable negatives more than less similar negatives. We find that Label-aware Contrastive Loss outperforms previous contrastive methods in the presence of a larger number of classes and/or more confusable classes, and helps models to produce output distributions that are more differentiated.

* Accepted at EMNLP 2021
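
The idea of weighting confusable negatives more heavily can be illustrated with a generic weighted contrastive loss for a single anchor. This is a sketch under stated assumptions, not the paper's exact formulation: the weights below are fixed by hand, whereas the abstract describes class relationships being embedded adaptively. The function name, similarity values, and weights are hypothetical.

```python
import math
from typing import Sequence


def weighted_contrastive_loss(pos_sim: float,
                              neg_sims: Sequence[float],
                              neg_weights: Sequence[float],
                              temperature: float = 0.1) -> float:
    """Contrastive loss for one anchor with a per-negative weight.

    With all weights equal to 1 this reduces to the usual
    softmax-style (InfoNCE) contrastive loss.
    """
    pos_term = math.exp(pos_sim / temperature)
    neg_term = sum(w * math.exp(s / temperature)
                   for s, w in zip(neg_sims, neg_weights))
    return -math.log(pos_term / (pos_term + neg_term))


# Hypothetical similarities: one positive, one confusable negative (0.7),
# and one clearly different negative (0.1).
pos_sim = 0.8
neg_sims = [0.7, 0.1]

uniform_loss = weighted_contrastive_loss(pos_sim, neg_sims, [1.0, 1.0])
confusable_loss = weighted_contrastive_loss(pos_sim, neg_sims, [1.5, 0.5])

print(uniform_loss, confusable_loss)
```

Giving the confusable negative a larger weight raises the loss for this anchor, so the gradient pushes harder on separating the classes that are easiest to mix up.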

Using Knowledge-Embedded Attention to Augment Pre-trained Language Models for Fine-Grained Emotion Recognition

Jul 31, 2021

Varsha Suresh, Desmond C. Ong

Abstract: Modern emotion recognition systems are trained to recognize only a small set of emotions, and hence fail to capture the broad spectrum of emotions people experience and express in daily life. In order to engage in more empathetic interactions, future AI has to perform fine-grained emotion recognition, distinguishing between many more varied emotions. Here, we focus on improving fine-grained emotion recognition by introducing external knowledge into a pre-trained self-attention model. We propose Knowledge-Embedded Attention (KEA) to use knowledge from emotion lexicons to augment the contextual representations from pre-trained ELECTRA and BERT models. Our results and error analyses show that KEA outperforms previous models on several datasets and is better able to differentiate closely confusable emotions, such as afraid and terrified.

* Accepted at IEEE Affective Computing and Intelligent Interaction (ACII) 2021

A Systematic Evaluation of Domain Adaptation in Facial Expression Recognition

Jun 29, 2021

Yan San Kong, Varsha Suresh, Jonathan Soh, Desmond C. Ong

Abstract: Facial Expression Recognition is a commercially important application, but one common limitation is that applications often require making predictions on out-of-sample distributions, where target images may have very different properties from the images that the model was trained on. How well, or badly, do these models do on unseen target domains? In this paper, we provide a systematic evaluation of domain adaptation in facial expression recognition. Using state-of-the-art transfer learning techniques and six commonly used facial expression datasets (three collected in the lab and three "in-the-wild"), we conduct extensive round-robin experiments to examine the classification accuracies for a state-of-the-art CNN model. We also perform multi-source experiments where we examine a model's ability to transfer from multiple source datasets, including (i) within-setting (e.g., lab to lab), (ii) cross-setting (e.g., in-the-wild to lab), and (iii) mixed-setting (e.g., lab and wild to lab) transfer learning experiments. We find sobering results: the accuracy of transfer learning is not high, and it varies idiosyncratically with the target dataset and, to a lesser extent, the source dataset. Generally, the best settings for transfer include fine-tuning the weights of a pre-trained model, and we find that training with more datasets, regardless of setting, improves transfer performance. We end with a discussion of the need for more -- and regular -- systematic investigations into the generalizability of FER models, especially for deployed applications.

Shape-CD: Change-Point Detection in Time-Series Data with Shapes and Neurons

Aug 01, 2020

Varsha Suresh, Wei Tsang Ooi

Abstract: Change-point detection in a time series aims to discover the time points at which some unknown underlying physical process that generates the time-series data has changed. We found that existing approaches become less accurate when the underlying process is complex and generates a large variety of patterns in the time series. To address this shortcoming, we propose Shape-CD, a simple, fast, and accurate change-point detection method. Shape-CD uses shape-based features to model the patterns and a conditional neural field to model the temporal correlations among the time regions. We evaluated the performance of Shape-CD using four highly dynamic time-series datasets, including the ExtraSensory dataset with up to 2000 classes. Shape-CD demonstrated improved accuracy (7-60% higher in AUC) and faster computational speed compared to existing approaches. Furthermore, the Shape-CD model consists of only hundreds of parameters and requires less data to train than other deep supervised learning models.

* The authors have withdrawn this paper as it needs a major revision. An error in the evaluation code invalidates the reported results.
