Title: Rethink the Effectiveness of Text Data Augmentation: An Empirical Analysis

Jun 13, 2023

Zhengxiang Shi, Aldo Lipani

Abstract: In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has been a topic of ongoing debate. In this study, we evaluate the effectiveness of three different FT methods in conjunction with back-translation across an array of 7 diverse NLP tasks, including classification and regression types, covering single-sentence and sentence-pair tasks. Contrary to prior assumptions that DA does not contribute to the enhancement of LMs' FT performance, our findings reveal that continued pre-training on augmented data can effectively improve the FT performance of the downstream tasks. In the most favourable case, continued pre-training improves the performance of FT by more than 10% in the few-shot learning setting. Our findings highlight the potential of DA as a powerful tool for bolstering LMs' performance.

* Accepted at ESANN 2023
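
The back-translation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the translation models or the pivot language, so the two translator callables are passed in as parameters, and the dictionary-based `toy_en_to_pivot` and `toy_pivot_to_en` functions are hypothetical stand-ins for real machine translation systems. The sketch only builds the unlabeled augmented corpus; continued pre-training and fine-tuning on it are left out.

```python
from typing import Callable, Iterable, List

# A translator is any function mapping one string to another.
# In practice this would wrap a machine translation model (assumption).
Translator = Callable[[str], str]


def back_translate(text: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Round-trip a sentence through a pivot language to obtain a paraphrase."""
    return from_pivot(to_pivot(text))


def build_augmented_corpus(
    texts: Iterable[str], to_pivot: Translator, from_pivot: Translator
) -> List[str]:
    """Return the originals plus their back-translations, in order, without duplicates."""
    corpus: List[str] = []
    seen = set()
    for text in texts:
        for candidate in (text, back_translate(text, to_pivot, from_pivot)):
            cleaned = candidate.strip()
            key = cleaned.lower()  # case-insensitive duplicate check
            if key and key not in seen:
                seen.add(key)
                corpus.append(cleaned)
    return corpus


# Hypothetical toy "translators": word-level lookups used only so that the
# example runs without any model or network access.
_EN_TO_PIVOT = {"the": "der", "movie": "film", "was": "war", "great": "toll"}
_PIVOT_TO_EN = {"der": "the", "film": "film", "war": "was", "toll": "fantastic"}


def toy_en_to_pivot(text: str) -> str:
    return " ".join(_EN_TO_PIVOT.get(word, word) for word in text.lower().split())


def toy_pivot_to_en(text: str) -> str:
    return " ".join(_PIVOT_TO_EN.get(word, word) for word in text.split())


if __name__ == "__main__":
    for line in build_augmented_corpus(
        ["The movie was great"], toy_en_to_pivot, toy_pivot_to_en
    ):
        print(line)
```

In a setup like the one the abstract describes, the resulting list would serve as unlabeled text for continued pre-training before the model is fine-tuned on the labeled task data. Sentences whose round trip returns the original text are dropped by the duplicate check, so they add nothing to the corpus.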
