Abstract:In this work, we measure the impact of affixal negation on modern English large language models (LLMs). In affixal negation, the negated meaning is expressed through a negative morpheme, which is potentially challenging for LLMs as their tokenizers are often not morphologically plausible. We conduct extensive experiments using LLMs with different subword tokenization methods, which lead to several insights into the interaction between tokenization performance and negation sensitivity. Despite some interesting mismatches between tokenization accuracy and negation detection performance, we show that models can, on the whole, reliably recognize the meaning of affixal negation.
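A minimal sketch (ours, not the paper's evaluation code) of the tokenization issue described above: it inspects how off-the-shelf subword tokenizers split affixally negated words. Whether a split is morphologically plausible ("un" + "aware") or not ("una" + "ware") depends entirely on each model's learned subword vocabulary; the model names below are illustrative.

```python
# Inspect subword splits of affixally negated words across tokenizers.
from transformers import AutoTokenizer

words = ["unaware", "impossible", "disagree", "careless"]

for name in ["bert-base-uncased", "roberta-base"]:
    tok = AutoTokenizer.from_pretrained(name)
    for w in words:
        # RoBERTa uses byte-level BPE, so a leading space marks a word boundary.
        text = " " + w if name == "roberta-base" else w
        print(f"{name:20s} {w:12s} -> {tok.tokenize(text)}")
```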
Abstract:Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundamental linguistic phenomenon that is central to language understanding. We evaluate different LLMs -- including the open-source GPT-Neo, GPT-3, and InstructGPT -- against a wide range of negation benchmarks. Through systematic experimentation with varying model sizes and prompts, we show that LLMs have several limitations, including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
Abstract:Evaluating multi-document summarization (MDS) quality is difficult. This is especially true in the case of MDS for biomedical literature reviews, where models must synthesize contradicting evidence reported across different documents. Prior work has shown that rather than performing the task, models may exploit shortcuts that are difficult to detect using standard n-gram similarity metrics such as ROUGE. Better automated evaluation metrics are needed, but few resources exist to assess metrics when they are proposed. Therefore, we introduce a dataset of human-assessed summary quality facets and pairwise preferences to encourage and support the development of better automated evaluation methods for literature review MDS. We take advantage of community submissions to the Multi-document Summarization for Literature Review (MSLR) shared task to compile a diverse and representative sample of generated summaries. We analyze how automated summarization evaluation metrics correlate with lexical features of generated summaries, with other automated metrics (including several we propose in this work), and with aspects of human-assessed summary quality. We find that not only do automated metrics fail to capture aspects of quality as assessed by humans, but in many cases the system rankings produced by these metrics are anti-correlated with rankings according to human annotators.
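As a concrete illustration of the system-level analysis (a sketch with invented numbers, not data from the paper): an automated metric that is anti-correlated with human judgments shows up as a negative Kendall's tau when systems are ranked by each.

```python
# Correlate per-system metric scores with human quality ratings.
from scipy.stats import kendalltau

# Hypothetical per-system averages (one entry per submission).
rouge_l = [0.42, 0.39, 0.37, 0.35, 0.33]
human_quality = [2.1, 2.8, 2.6, 3.0, 3.2]   # e.g., mean facet rating

tau, p_value = kendalltau(rouge_l, human_quality)
print(f"system-level Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
```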
Abstract:We present the first empirical study investigating the influence of disfluency detection on the downstream tasks of intent detection and slot filling. We perform this study for Vietnamese -- a low-resource language with no prior study and no public dataset available for disfluency detection. First, we extend the fluent Vietnamese intent detection and slot filling dataset PhoATIS by manually adding contextual disfluencies and annotating them. Then, we conduct experiments using strong baselines for disfluency detection and joint intent detection and slot filling, which are based on pre-trained language models. We find that: (i) disfluencies negatively affect performance on the downstream intent detection and slot filling tasks, and (ii) in the disfluency context, the pre-trained multilingual language model XLM-R yields better intent detection and slot filling performance than the pre-trained monolingual language model PhoBERT, which is the opposite of what is generally found in the fluent context.
Abstract:Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to the diversity of its expression across different types of text. Recent work has shown that state-of-the-art NLP models underperform on samples containing negation in various tasks, and that negation detection models do not transfer well across domains. We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking, to better incorporate negation information into language models. Extensive experiments on common benchmarks show that our proposed approach improves negation detection performance and generalizability over the strong baseline NegBERT (Khandelwal and Sawant, 2020).
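A minimal sketch of the negation masking idea (our illustration under simplifying assumptions; the cue lexicon and masking probability below are invented, not the paper's exact procedure): tokens matching a small negation-cue list are replaced with the model's mask token so that continued masked-language-model pre-training focuses on negation.

```python
# Mask negation cues (plus a few random tokens) for MLM-style pre-training.
import random
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
NEGATION_CUES = {"not", "no", "never", "without", "nothing", "nobody"}

def mask_negation(sentence: str, extra_prob: float = 0.1) -> str:
    tokens = tokenizer.tokenize(sentence)
    masked = [
        tokenizer.mask_token
        if t in NEGATION_CUES or random.random() < extra_prob  # cue or random token
        else t
        for t in tokens
    ]
    return tokenizer.convert_tokens_to_string(masked)

print(mask_negation("The patient did not report any pain."))
```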
Abstract:This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track. The task focuses on the problem of matching eligible clinical trials to topics constituting a summary of a patient's admission notes. We explore different ways of representing trials and topics using NLP techniques, and then use a common retrieval model to generate the ranked list of relevant trials for each topic. The results from all our submitted runs are well above the median scores for all topics, but there is still plenty of scope for improvement.
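As a sketch of the retrieval step (our simplification with made-up trial texts; BM25 stands in here for whichever retrieval model the submitted runs actually use), ranking candidate trials against a topic might look like:

```python
# Rank candidate clinical trials against a patient topic with BM25.
from rank_bm25 import BM25Okapi

trial_texts = {
    "NCT0001": "randomized trial of drug X in adults with type 2 diabetes",
    "NCT0002": "observational study of exercise in elderly patients",
}
topic = "65 year old male with poorly controlled type 2 diabetes"

trial_ids = list(trial_texts)
corpus = [trial_texts[t].lower().split() for t in trial_ids]
bm25 = BM25Okapi(corpus)

scores = bm25.get_scores(topic.lower().split())
ranking = sorted(zip(trial_ids, scores), key=lambda x: x[1], reverse=True)
print(ranking)  # ranked list of (trial id, score) for this topic
```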
Abstract:The current COVID-19 pandemic has led to the creation of many corpora that facilitate NLP research and downstream applications to help fight the pandemic. However, most of these corpora are exclusively for English. As the pandemic is a global problem, it is worth creating COVID-19 related datasets for languages other than English. In this paper, we present the first manually-annotated COVID-19 domain-specific dataset for Vietnamese. In particular, our dataset is annotated for the named entity recognition (NER) task with newly-defined entity types that can be reused in future epidemics. Our dataset also contains the largest number of entities of any existing Vietnamese NER dataset. We conduct experiments using strong baselines on our dataset and find that: (i) automatic Vietnamese word segmentation helps improve the NER results, and (ii) the highest performance is obtained by fine-tuning pre-trained language models, where the monolingual model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) produces better results than the multilingual model XLM-R (Conneau et al., 2020). We publicly release our dataset at: https://github.com/VinAIResearch/PhoNER_COVID19
Abstract:Intent detection and slot filling are important tasks in spoken and natural language understanding. However, Vietnamese is a low-resource language in these research topics. In this paper, we present the first public intent detection and slot filling dataset for Vietnamese. In addition, we propose a joint model for intent detection and slot filling that extends the recent state-of-the-art JointBERT+CRF model with an intent-slot attention layer, in order to explicitly incorporate intent context information into slot filling via "soft" intent label embedding. Experimental results on our Vietnamese dataset show that our proposed model significantly outperforms JointBERT+CRF. We publicly release our dataset and the implementation of our model at: https://github.com/VinAIResearch/JointIDSF
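The following PyTorch sketch illustrates the "soft" intent label embedding idea in isolation (our simplification with made-up dimensions; the released model also includes a CRF layer and attention details not shown here): the softmax over intent logits weights learned intent label embeddings, and the resulting vector conditions the slot classifier at every token position.

```python
import torch
import torch.nn as nn

class SoftIntentSlotHead(nn.Module):
    def __init__(self, hidden: int, n_intents: int, n_slots: int):
        super().__init__()
        self.intent_clf = nn.Linear(hidden, n_intents)
        self.intent_emb = nn.Embedding(n_intents, hidden)   # intent label embeddings
        self.slot_clf = nn.Linear(2 * hidden, n_slots)

    def forward(self, token_states, cls_state):
        intent_logits = self.intent_clf(cls_state)                        # (B, I)
        soft_intent = intent_logits.softmax(-1) @ self.intent_emb.weight  # (B, H)
        # Broadcast the soft intent vector to every token position.
        ctx = soft_intent.unsqueeze(1).expand_as(token_states)
        slot_logits = self.slot_clf(torch.cat([token_states, ctx], dim=-1))
        return intent_logits, slot_logits

head = SoftIntentSlotHead(hidden=768, n_intents=28, n_slots=82)
tokens, cls = torch.randn(2, 16, 768), torch.randn(2, 768)
intent_logits, slot_logits = head(tokens, cls)
print(intent_logits.shape, slot_logits.shape)  # (2, 28) and (2, 16, 82)
```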