Title: Impact of Adversarial Training on Robustness and Generalizability of Language Models

Nov 10, 2022

Enes Altinisik, Hassan Sajjad, Husrev Taha Sencar, Safa Messaoud, Sanjay Chawla

Abstract: Adversarial training is widely acknowledged as the most effective defense against adversarial attacks. However, it is also well established that achieving both robustness and generalization in adversarially trained models involves a trade-off. The goal of this work is to provide an in-depth comparison of different approaches for adversarial training in language models. Specifically, we study the effect of pre-training data augmentation as well as training-time input perturbations vs. embedding space perturbations on the robustness and generalization of BERT-like language models. Our findings suggest that better robustness can be achieved by pre-training data augmentation or by training with input space perturbation. However, training with embedding space perturbation significantly improves generalization. A linguistic correlation analysis of neurons of the learned models reveals that the improved generalization is due to 'more specialized' neurons. To the best of our knowledge, this is the first work to carry out a deep qualitative analysis of different methods of generating adversarial examples in adversarial training of language models.
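
The abstract contrasts input space perturbations with embedding space perturbations but does not spell out how either is generated. The toy sketch below illustrates the general distinction only; it is not the paper's method. It assumes a hypothetical four-word vocabulary with hand-picked 2-d embeddings, a fixed logistic classifier, a single normalized gradient step for the embedding space case, and a greedy one-token substitution for the input space case. All names and values (`EMBEDDINGS`, `SUBSTITUTES`, `WEIGHTS`, `epsilon`) are illustrative assumptions.

```python
import math

# Hypothetical toy vocabulary with 2-d embeddings (values are made up).
EMBEDDINGS = {
    "good": [1.0, 0.5],
    "fine": [0.6, 0.4],
    "bad": [-1.0, -0.5],
    "movie": [0.0, 0.1],
}
# Hypothetical substitution candidates used by the input space attack.
SUBSTITUTES = {"good": ["fine"], "fine": ["good"], "bad": [], "movie": []}
# Fixed linear classifier standing in for a trained model.
WEIGHTS = [2.0, 1.0]
BIAS = 0.0


def sentence_embedding(tokens):
    """Mean-pool the token embeddings into one sentence vector."""
    dim = len(WEIGHTS)
    return [sum(EMBEDDINGS[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def predict(x):
    """Probability of the positive class under the logistic classifier."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def loss(x, label):
    """Binary cross-entropy for a single example."""
    p = predict(x)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def embedding_space_perturb(x, label, epsilon):
    """Take one L2-normalized gradient ascent step on the loss w.r.t. x."""
    p = predict(x)
    grad = [(p - label) * w for w in WEIGHTS]  # d(loss)/dx for logistic loss
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(x)
    return [xi + epsilon * g / norm for xi, g in zip(x, grad)]


def input_space_perturb(tokens, label):
    """Greedily pick the single token substitution that raises the loss most."""
    best = list(tokens)
    best_loss = loss(sentence_embedding(tokens), label)
    for i, tok in enumerate(tokens):
        for sub in SUBSTITUTES.get(tok, []):
            candidate = tokens[:i] + [sub] + tokens[i + 1:]
            candidate_loss = loss(sentence_embedding(candidate), label)
            if candidate_loss > best_loss:
                best, best_loss = candidate, candidate_loss
    return best
```

In adversarial training, examples produced this way would be fed back into the training loss alongside or instead of the clean examples. The embedding space variant moves the continuous vector directly, so the perturbed point need not correspond to any real sentence, while the input space variant always yields a valid token sequence.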
