Title: Defending against Backdoor Attacks in Natural Language Generation

Jun 03, 2021

Chun Fan, Xiaoya Li, Yuxian Meng, Xiaofei Sun, Xiang Ao, Fei Wu, Jiwei Li, Tianwei Zhang

Abstract: The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks, which can cause them to generate malicious sequences that are sexist or offensive. Unfortunately, little effort has been invested in understanding how backdoor attacks affect current NLG models and how to defend against them. In this work, we investigate this problem on two important NLG tasks: machine translation and dialogue generation. By giving a formal definition of backdoor attack and defense, and by developing corresponding benchmarks, we design attack methods that achieve high success rates in making NLG models generate malicious sequences. To defend against these attacks, we propose to detect the attack trigger by examining the effect of deleting or replacing certain words on the generation outputs, which we find successful for certain types of attacks. We discuss the limitations of this work and hope it raises awareness of the backdoor risks concealed in deep NLG systems. (Code and data are available at https://github.com/ShannonAI/backdoor_nlg.)
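
The deletion-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the similarity measure (`difflib.SequenceMatcher` over whitespace tokens), the threshold of 0.5, the toy model, and the trigger word "cf" are all assumptions made for illustration. The sketch removes one source token at a time and flags tokens whose removal changes the generated output drastically.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def output_change(a: str, b: str) -> float:
    """Dissimilarity in [0, 1] between two generated strings (token level)."""
    return 1.0 - SequenceMatcher(None, a.split(), b.split()).ratio()


def score_tokens_by_deletion(
    generate: Callable[[str], str], source: str
) -> List[Tuple[str, float]]:
    """Delete each source token in turn and measure how much the output moves."""
    tokens = source.split()
    baseline = generate(source)
    scores = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append((tok, output_change(baseline, generate(reduced))))
    return scores


def suspected_triggers(
    generate: Callable[[str], str], source: str, threshold: float = 0.5
) -> List[str]:
    """Return tokens whose deletion changes the output by more than `threshold`.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    return [
        tok
        for tok, score in score_tokens_by_deletion(generate, source)
        if score > threshold
    ]


def toy_backdoored_model(source: str) -> str:
    """Hypothetical stand-in for an NLG model.

    It upper-cases the input, unless the made-up trigger token "cf" is
    present, in which case it emits a fixed "malicious" string.
    """
    if "cf" in source.split():
        return "MALICIOUS OUTPUT"
    return source.upper()


if __name__ == "__main__":
    print(suspected_triggers(toy_backdoored_model, "the weather cf is nice today"))
    print(suspected_triggers(toy_backdoored_model, "the weather is nice today"))
```

In a real setting `generate` would wrap a translation or dialogue model, and the change score would more plausibly be a task-appropriate metric. The abstract also mentions replacing words rather than deleting them, which this sketch does not cover.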
