Saket Maheshwary

A Strong Baseline for Query Efficient Attacks in a Black Box Setting

Sep 10, 2021

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

Abstract: Existing black box search methods achieve high success rates in generating adversarial attacks against NLP models. However, these search methods are inefficient because they do not account for the number of queries required to generate adversarial attacks. Prior attacks also do not maintain a consistent search space when comparing different search methods. In this paper, we propose a query-efficient attack strategy that generates plausible adversarial examples on text classification and entailment tasks. Our attack jointly leverages an attention mechanism and locality sensitive hashing (LSH) to reduce the query count. We demonstrate the efficacy of our approach by comparing our attack with four baselines across three different search spaces. Further, we benchmark our results on the same search space used in prior attacks. Compared with previously proposed attacks, we reduce the query count by 75% on average across all datasets and target models. We also demonstrate that our attack achieves a higher success rate than prior attacks in a limited query setting.

* EMNLP 2021 - Main Conference
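
The abstract above says only that LSH helps reduce the query count; it does not describe the mechanism. As a rough illustration of how hashing could save queries, the sketch below groups candidate substitutions with random-hyperplane LSH so that only one representative per bucket would need to be sent to the target model. The function names, the toy 3-dimensional vectors, and the choice of random-hyperplane hashing are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random hyperplanes (normal vectors) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_candidates(candidates, hyperplanes):
    """Group candidate names by signature; similar directions tend to collide."""
    buckets = defaultdict(list)
    for name, vector in candidates.items():
        buckets[lsh_signature(vector, hyperplanes)].append(name)
    return dict(buckets)


def representatives(buckets):
    """Pick one candidate per bucket; only these would be queried."""
    return [members[0] for members in buckets.values()]


# Toy embeddings (hypothetical): "great" points the same way as "good".
planes = make_hyperplanes(num_bits=8, dim=3, seed=1)
cands = {
    "good": [1.0, 0.5, -0.2],
    "great": [2.0, 1.0, -0.4],
    "bad": [-1.0, -0.5, 0.2],
}
buckets = bucket_candidates(cands, planes)
to_query = representatives(buckets)
```

In this toy setup, three candidates collapse into two buckets, so two model queries would stand in for three. How the authors combine hashing with attention scores is not specified in the abstract and is not modeled here.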

Generating Natural Language Attacks in a Hard Label Black Box Setting

Dec 29, 2020

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

Abstract: We study the important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification and entailment tasks. Our proposed attack strategy leverages a population-based optimization algorithm to craft plausible and semantically similar adversarial examples by observing only the top label predicted by the target model. At each iteration, the optimization procedure allows word replacements that maximize the overall semantic similarity between the original and the adversarial text. Further, our approach does not rely on substitute models or any kind of training data. We demonstrate the efficacy of our proposed approach through extensive experimentation and ablation studies on five state-of-the-art target models across seven benchmark datasets. Compared with attacks proposed in prior literature, we achieve a higher success rate with a lower word perturbation percentage, even in a highly restricted setting.

* Accepted at AAAI 2021 (Main Conference)
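
To make the hard-label idea in the abstract above concrete, here is a minimal toy sketch of a population-based search that sees only the predicted label and prefers candidates closer to the original text. Everything in it is an assumption for illustration: `toy_predict` is a hypothetical stand-in for the target model, `SYNONYMS` is a made-up substitution table, and `similarity` (fraction of unchanged words) is a crude proxy for semantic similarity. It is not the authors' algorithm.

```python
import random

POSITIVE = {"good", "great", "fine"}  # hypothetical sentiment lexicon
SYNONYMS = {  # hypothetical substitution table
    "good": ["decent", "fine"],
    "great": ["notable", "grand"],
    "movie": ["film"],
}


def toy_predict(words):
    """Hypothetical target model: exposes only the top label, no scores."""
    return "pos" if any(w in POSITIVE for w in words) else "neg"


def similarity(original, candidate):
    """Crude proxy for semantic similarity: fraction of unchanged words."""
    return sum(o == c for o, c in zip(original, candidate)) / len(original)


def crossover(a, b, rng):
    """Take each position from one of the two parents at random."""
    return [rng.choice(pair) for pair in zip(a, b)]


def mutate(original, candidate, rng):
    """At one replaceable position, revert to the original word or pick a synonym."""
    child = list(candidate)
    positions = [i for i, w in enumerate(original) if w in SYNONYMS]
    i = rng.choice(positions)
    child[i] = rng.choice([original[i]] + SYNONYMS[original[i]])
    return child


def hard_label_attack(original, pop_size=6, generations=20, seed=0, max_tries=200):
    rng = random.Random(seed)
    true_label = toy_predict(original)

    def is_adversarial(cand):
        return toy_predict(cand) != true_label  # one hard-label query

    # Initialise with heavily perturbed texts that already flip the label.
    population = []
    for _ in range(max_tries):
        cand = [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in original]
        if is_adversarial(cand):
            population.append(cand)
        if len(population) == pop_size:
            break
    if not population:
        return None  # no adversarial starting point found

    # Keep only label-flipping children; rank survivors by similarity.
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.choice(population), rng.choice(population)
            child = mutate(original, crossover(a, b, rng), rng)
            if is_adversarial(child):
                children.append(child)
        population = sorted(
            population + children,
            key=lambda c: similarity(original, c),
            reverse=True,
        )[:pop_size]
    return population[0]


original = "a good movie with great acting".split()
adversarial = hard_label_attack(original)
```

Because every member of the population must flip the label and the best members always survive selection, the returned text is adversarial under `toy_predict` and is never less similar to the original than the starting candidates.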

A Context Aware Approach for Generating Natural Language Attacks

Dec 24, 2020

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

Abstract: We study the important task of attacking natural language processing models in a black box setting. We propose an attack strategy that crafts semantically similar adversarial examples on text classification and entailment tasks. Our proposed attack finds candidate words by considering information from both the original word and its surrounding context. It jointly leverages masked language modelling and next sentence prediction for context understanding. Compared with attacks proposed in prior literature, we generate high quality adversarial examples that perform significantly better in terms of both success rate and word perturbation percentage.

* Accepted as Student Poster at AAAI 2021
