Mansoor Ahmad

Synthetic Data Augmentation for Cross-domain Implicit Discourse Relation Recognition

Mar 26, 2025

Frances Yung, Varsha Suresh, Zaynab Reza, Mansoor Ahmad, Vera Demberg

Abstract: Implicit discourse relation recognition (IDRR) -- the task of identifying the implicit coherence relation between two text spans -- requires deep semantic understanding. Recent studies have shown that zero- or few-shot approaches significantly lag behind supervised models, but LLMs may be useful for synthetic data augmentation, where LLMs generate a second argument following a specified coherence relation. We applied this approach in a cross-domain setting, generating discourse continuations using unlabelled target-domain data to adapt a base model which was trained on source-domain labelled data. Evaluations conducted on a large-scale test set revealed that different variations of the approach did not result in any significant improvements. We conclude that LLMs often fail to generate useful samples for IDRR, and emphasize the importance of considering both statistical significance and comparability when evaluating IDRR models.

Prompting Implicit Discourse Relation Annotation

Feb 07, 2024

Frances Yung, Mansoor Ahmad, Merel Scholman, Vera Demberg

Abstract: Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers. Nonetheless, ChatGPT's performance in the task of implicit discourse relation classification, prompted by a standard multiple-choice question, is still far from satisfactory and considerably inferior to state-of-the-art supervised approaches. This work investigates several proven prompting techniques to improve ChatGPT's recognition of discourse relations. In particular, we experimented with breaking down the classification task that involves numerous abstract labels into smaller subtasks. However, experimental results show that the inference accuracy hardly changes even with sophisticated prompt engineering, suggesting that implicit discourse relation classification is not yet resolvable under zero-shot or few-shot settings.

* To appear at the Linguistic Annotation Workshop 2024
