Title: Thieves on Sesame Street! Model Extraction of BERT-based APIs

Oct 27, 2019

Kalpesh Krishna, Gaurav Singh Tomar, Ankur P. Parikh, Nicolas Papernot, Mohit Iyyer

Abstract: We study the problem of model extraction in natural language processing, in which an adversary with only query access to a victim model attempts to reconstruct a local copy of that model. Assuming that both the adversary and victim model fine-tune a large pretrained language model such as BERT (Devlin et al., 2019), we show that the adversary does not need any real training data to successfully mount the attack. In fact, the attacker need not even use grammatical or semantically meaningful queries: we show that random sequences of words coupled with task-specific heuristics form effective queries for model extraction on a diverse set of NLP tasks, including natural language inference and question answering. Our work thus highlights an exploit made feasible only by the shift towards transfer learning methods within the NLP community: for a query budget of a few hundred dollars, an attacker can extract a model that performs only slightly worse than the victim model. Finally, we study two defense strategies against model extraction, membership classification and API watermarking, which, while successful against naive adversaries, are ineffective against more sophisticated ones.
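The query-collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `random_query`, `build_extraction_set`, and `toy_victim`, the tiny vocabulary, and the length bounds are all hypothetical, and the task-specific heuristics mentioned in the abstract are omitted. The sketch only shows the general idea of sending nonsensical word sequences to a black-box model and recording its outputs as training labels.

```python
import random
from typing import Callable, List, Tuple


def random_query(vocab: List[str], length: int, rng: random.Random) -> str:
    """Sample a nonsensical query: `length` words drawn uniformly from `vocab`."""
    return " ".join(rng.choice(vocab) for _ in range(length))


def build_extraction_set(
    victim: Callable[[str], str],
    vocab: List[str],
    n_queries: int,
    min_len: int = 5,   # assumed bounds, chosen only for illustration
    max_len: int = 15,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Query the black-box `victim` and keep (query, victim output) pairs."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_queries):
        query = random_query(vocab, rng.randint(min_len, max_len), rng)
        dataset.append((query, victim(query)))  # victim output becomes the label
    return dataset


def toy_victim(text: str) -> str:
    """Hypothetical stand-in for a remote prediction API (not a real model)."""
    return "positive" if "good" in text.split() else "negative"


vocab = ["good", "bad", "movie", "the", "plot", "actor", "slow", "bright"]
pairs = build_extraction_set(toy_victim, vocab, n_queries=100)
```

In the setting the abstract describes, pairs collected this way would serve as the training data on which the adversary fine-tunes its own pretrained language model. That fine-tuning step is left out here to keep the sketch dependency-free.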

* preprint, 18 pages

View paper on OpenReview
