Title: Measuring the Robustness of Natural Language Processing Models to Domain Shifts

May 31, 2023

Nitay Calderon, Naveh Porat, Eyal Ben-David, Zorik Gekhman, Nadav Oved, Roi Reichart

Abstract: Large Language Models have shown promising performance across various learning settings, including fine-tuning, few-shot learning, and zero-shot learning. However, their performance on domains without labeled data still lags behind their performance on domains with labeled data, which we refer to as the Domain Robustness (DR) challenge. Existing research on DR suffers from disparate setups, a lack of evaluation task variety, and reliance on challenge sets. In this paper, we explore the DR challenge of both fine-tuned and few-shot learning models in natural domain shift settings. We introduce a DR benchmark comprising diverse NLP tasks, including sentence-level and token-level classification, QA, and generation, with each task consisting of several domains. We propose two views of the DR challenge: Source Drop (SD) and Target Drop (TD), which alternate between the source and target in-domain performance as reference points. We find that in a significant proportion of domain shifts, either SD or TD is positive, but not both, emphasizing the importance of considering both measures as diagnostic tools. Our experimental results demonstrate the persistent existence of the DR challenge in both fine-tuning and few-shot learning models, though it is less pronounced in the latter. We also find that increasing the fine-tuned model size improves performance, particularly in classification.
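
The two views described in the abstract can be illustrated with a small sketch. The abstract states only that SD and TD use the source and target in-domain performance as reference points; the exact formulas below (in-domain score minus cross-domain score) are an assumption made for illustration, and the domain names and scores are hypothetical, not results from the paper.

```python
from typing import Dict, Tuple

# scores[(train_domain, test_domain)] holds a task metric where higher is better.
Scores = Dict[Tuple[str, str], float]


def source_drop(scores: Scores, source: str, target: str) -> float:
    """Assumed SD: source in-domain score minus cross-domain score."""
    return scores[(source, source)] - scores[(source, target)]


def target_drop(scores: Scores, source: str, target: str) -> float:
    """Assumed TD: target in-domain score minus cross-domain score."""
    return scores[(target, target)] - scores[(source, target)]


def diagnose(scores: Scores, source: str, target: str) -> dict:
    """Report both drops and flag shifts where exactly one of them is positive."""
    sd = source_drop(scores, source, target)
    td = target_drop(scores, source, target)
    return {"SD": sd, "TD": td, "only_one_positive": (sd > 0) != (td > 0)}


# Hypothetical scores for two made-up domains, "A" and "B".
scores = {
    ("A", "A"): 90.0,  # trained on A, tested on A (source in-domain)
    ("A", "B"): 80.0,  # trained on A, tested on B (cross-domain)
    ("B", "B"): 78.0,  # trained on B, tested on B (target in-domain)
    ("B", "A"): 85.0,  # trained on B, tested on A
}

result = diagnose(scores, "A", "B")
print(result)
```

In this made-up case the model trained on A scores lower on B than on its own domain, so the assumed SD is positive, yet it still outperforms the model trained on B, so the assumed TD is negative. A shift like this looks like a robustness failure under one measure and not under the other, which is the kind of disagreement the abstract cites as the reason to report both.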
