Mingze Ni

Cross-Entropy Attacks to Language Models via Rare Event Simulation

Jan 21, 2025

Mingze Ni, Yongshun Gong, Wei Liu

Abstract: Black-box textual adversarial attacks are challenging due to the lack of model information and the discrete, non-differentiable nature of text. Existing methods often lack versatility for attacking different models, suffer from limited attacking performance due to the inefficient optimization with word saliency ranking, and frequently sacrifice semantic integrity to achieve better attack outcomes. This paper introduces a novel approach to textual adversarial attacks, which we call Cross-Entropy Attacks (CEA), that uses Cross-Entropy optimization to address the above issues. Our CEA approach defines adversarial objectives for both soft-label and hard-label settings and employs CE optimization to identify optimal replacements. Through extensive experiments on document classification and language translation problems, we demonstrate that our attack method excels in terms of attacking performance, imperceptibility, and sentence quality.
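
To make the idea of cross-entropy (CE) optimization over word replacements concrete, here is a minimal sketch of the generic cross-entropy method on a discrete search space. It is not the paper's CEA implementation: the function name `cross_entropy_search`, the per-position categorical distributions, the hyperparameters, and the toy scoring function are all assumptions made for illustration. In an actual attack, `score` would be a black-box adversarial objective computed from the victim model's output.

```python
import random

def cross_entropy_search(candidates, score, n_samples=50, elite_frac=0.2,
                         smoothing=0.7, n_iters=20, seed=0):
    """Generic cross-entropy method with one categorical distribution per position.

    candidates[i] lists the replacement options for position i (assumed given).
    score is a black-box function: list of chosen words -> float, higher is better.
    """
    rng = random.Random(seed)
    # Start from a uniform distribution over the options at each position.
    probs = [[1.0 / len(opts)] * len(opts) for opts in candidates]
    n_elite = max(1, int(n_samples * elite_frac))
    best, best_score = None, float("-inf")

    for _ in range(n_iters):
        # Sample one option index per position from the current distributions.
        samples = [
            [rng.choices(range(len(opts)), weights=p)[0]
             for opts, p in zip(candidates, probs)]
            for _ in range(n_samples)
        ]
        scored = sorted(
            ((score([candidates[i][j] for i, j in enumerate(s)]), s)
             for s in samples),
            key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_score:
            best_score = scored[0][0]
            best = [candidates[i][j] for i, j in enumerate(scored[0][1])]
        elite = [s for _, s in scored[:n_elite]]
        # Move each distribution toward the empirical frequencies of the elite set.
        for i, p in enumerate(probs):
            counts = [0] * len(p)
            for s in elite:
                counts[s[i]] += 1
            probs[i] = [(1 - smoothing) * old + smoothing * c / n_elite
                        for old, c in zip(p, counts)]
    return best, best_score

# Toy stand-in for an adversarial objective (hypothetical): count matches
# against a hidden "most adversarial" choice at each position.
candidates = [["good", "fine", "decent"], ["movie", "film"],
              ["overall", "mostly", "largely"]]
hidden = ["decent", "film", "largely"]

def toy_score(words):
    return sum(w == h for w, h in zip(words, hidden))

best, best_score = cross_entropy_search(candidates, toy_score)
print(best, best_score)
```

The method only ever queries `score`, which is why it suits a black-box setting: no gradients or model internals are needed.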

Deceiving Question-Answering Models: A Hybrid Word-Level Adversarial Approach

Nov 12, 2024

Jiyao Li, Mingze Ni, Yongshun Gong, Wei Liu

Abstract: Deep learning underpins most of the currently advanced natural language processing (NLP) tasks such as textual classification, neural machine translation (NMT), abstractive summarization and question-answering (QA). However, the robustness of the models, particularly QA models, against adversarial attacks is a critical concern that remains insufficiently explored. This paper introduces QA-Attack (Question Answering Attack), a novel word-level adversarial strategy that fools QA models. Our attention-based attack exploits the customized attention mechanism and deletion ranking strategy to identify and target specific words within contextual passages. It creates deceptive inputs by carefully choosing and substituting synonyms, preserving grammatical integrity while misleading the model to produce incorrect responses. Our approach demonstrates versatility across various question types, particularly when dealing with long textual inputs. Extensive experiments on multiple benchmark datasets demonstrate that QA-Attack successfully deceives baseline QA models and surpasses existing adversarial techniques regarding success rate, semantic changes, BLEU score, fluency and grammar error rate.
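
The deletion ranking mentioned in the abstract can be illustrated with a generic sketch: remove each word in turn and rank words by how much the model's confidence drops. This is only one plausible reading, not the QA-Attack code. The function `deletion_ranking` and the keyword-based `toy_confidence` are hypothetical, and the attention-based part of the method is not modelled here.

```python
def deletion_ranking(words, confidence):
    """Rank words by the confidence drop caused by deleting each one.

    confidence is a black-box function: list of words -> float.
    Returns (drop, index, word) tuples, largest drop first.
    """
    base = confidence(words)
    drops = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        drops.append((base - confidence(reduced), i, words[i]))
    return sorted(drops, key=lambda t: t[0], reverse=True)

# Toy stand-in for a QA model's answer confidence (hypothetical): the
# confidence grows with the number of key context words still present.
KEYWORDS = {"Paris", "capital"}

def toy_confidence(words):
    return 0.2 + 0.4 * sum(w in KEYWORDS for w in words)

context = "Paris is the capital of France".split()
ranking = deletion_ranking(context, toy_confidence)
for drop, index, word in ranking[:2]:
    print(index, word, round(drop, 2))
```

The words at the top of the ranking are the natural candidates for synonym substitution, since changing them is most likely to alter the model's answer.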

Dreaming is All You Need

Sep 03, 2024

Mingze Ni, Wei Liu

Abstract: In classification tasks, achieving a harmonious balance between exploration and precision is of paramount importance. To this end, this research introduces two novel deep learning models, SleepNet and DreamNet, to strike this balance. SleepNet seamlessly integrates supervised learning with unsupervised "sleep" stages using pre-trained encoder models. Dedicated neurons within SleepNet are embedded in these unsupervised features, forming intermittent "sleep" blocks that facilitate exploratory learning. Building upon the foundation of SleepNet, DreamNet employs full encoder-decoder frameworks to reconstruct the hidden states, mimicking the human "dreaming" process. This reconstruction process enables further exploration and refinement of the learned representations. Moreover, the principal ideas of our SleepNet and DreamNet are generic and can be applied to both computer vision and natural language processing downstream tasks. Through extensive empirical evaluations on diverse image and text datasets, SleepNet and DreamNet have demonstrated superior performance compared to state-of-the-art models, showcasing the strengths of unsupervised exploration and supervised precision afforded by our innovative approaches.

Reversible Jump Attack to Textual Classifiers with Modification Reduction

Mar 21, 2024

Mingze Ni, Zhensu Sun, Wei Liu

Abstract: Recent studies on adversarial examples expose vulnerabilities of natural language processing (NLP) models. Existing techniques for generating adversarial examples are typically driven by deterministic hierarchical rules that are agnostic to the optimal adversarial examples, a strategy that often results in adversarial samples with a suboptimal balance between magnitudes of changes and attack successes. To this end, in this research we propose two algorithms, Reversible Jump Attack (RJA) and Metropolis-Hastings Modification Reduction (MMR), to generate highly effective adversarial examples and to improve the imperceptibility of the examples, respectively. RJA utilizes a novel randomization mechanism to enlarge the search space and efficiently adapts to a number of perturbed words for adversarial examples. With these generated adversarial examples, MMR applies the Metropolis-Hastings sampler to enhance the imperceptibility of adversarial examples. Extensive experiments demonstrate that RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency and grammar correctness.

AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization

Feb 20, 2024

Jiyao Li, Mingze Ni, Yifei Dong, Tianqing Zhu, Wei Liu

Abstract: Recent advances in deep learning research have shown remarkable achievements across many tasks in computer vision (CV) and natural language processing (NLP). At the intersection of CV and NLP is the problem of image captioning, where the related models' robustness against adversarial attacks has not been well studied. In this paper, we present a novel adversarial attack strategy, which we call AICAttack (Attention-based Image Captioning Attack), designed to attack image captioning models through subtle perturbations on images. Operating within a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information. We introduce an attention-based candidate selection mechanism that identifies the optimal pixels to attack, followed by Differential Evolution (DE) for perturbing pixels' RGB values. We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets with multiple victim models. The experimental results demonstrate that our method surpasses current leading-edge techniques by effectively distributing the alignment and semantics of words in the output.
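
As an illustration of the Differential Evolution step, the sketch below implements the classic DE/rand/1/bin scheme over a vector of RGB values in [0, 255]. It assumes the pixel locations have already been chosen (the attention-based selection is not modelled), and it is not AICAttack's implementation: the function name, the hyperparameters, and the toy fitness are assumptions. In a real black-box attack, `fitness` would query the captioning model and measure how far the caption has been pushed from the original.

```python
import random

def differential_evolution(fitness, dim, bounds=(0.0, 255.0), pop_size=20,
                           f_weight=0.5, cr=0.9, n_gens=100, seed=0):
    """Maximise a black-box fitness with DE/rand/1/bin."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(ind) for ind in pop]
    for _ in range(n_gens):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f_weight * (pop[b][d] - pop[c][d])
                    v = min(hi, max(lo, v))  # clip to the valid RGB range
                else:
                    v = pop[i][d]
                trial.append(v)
            f_trial = fitness(trial)
            # Greedy selection: keep the trial only if it is at least as good.
            if f_trial >= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

# Toy stand-in for the attack objective (hypothetical): two pixels, three
# channels each, with fitness highest at a hidden colour vector.
hidden_rgb = [255.0, 0.0, 128.0, 10.0, 200.0, 64.0]

def toy_fitness(values):
    return -sum((v - h) ** 2 for v, h in zip(values, hidden_rgb))

best_rgb, best_fit = differential_evolution(toy_fitness, dim=len(hidden_rgb))
print([round(v) for v in best_rgb], best_fit)
```

DE fits the black-box setting described in the abstract because it needs only fitness evaluations, never gradients or parameters of the victim model.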

Frauds Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process

Mar 01, 2023

Mingze Ni, Zhensu Sun, Wei Liu

Abstract: Recent studies on adversarial examples expose vulnerabilities of natural language processing (NLP) models. Existing techniques for generating adversarial examples are typically driven by deterministic heuristic rules that are agnostic to the optimal adversarial examples, a strategy that often results in attack failures. To this end, this research proposes Fraud's Bargain Attack (FBA), which utilizes a novel randomization mechanism to enlarge the search space and enables high-quality adversarial examples to be generated with high probabilities. FBA applies the Metropolis-Hastings sampler, a member of Markov Chain Monte Carlo samplers, to enhance the selection of adversarial examples from all candidates proposed by a customized stochastic process that we call the Word Manipulation Process (WMP). WMP perturbs one word at a time via insertion, removal or substitution in a context-aware manner. Extensive experiments demonstrate that FBA outperforms the state-of-the-art methods in terms of both attack success rate and imperceptibility.
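
The role of the Metropolis-Hastings sampler can be sketched with a deliberately simplified proposal. The example below uses substitution only, drawn uniformly from a fixed vocabulary, so the proposal is symmetric and the acceptance ratio reduces to the ratio of target densities. The WMP described in the abstract also inserts and removes words and is context-aware, none of which is modelled here. The function name, vocabulary, and toy target density are assumptions.

```python
import math
import random

def metropolis_hastings_substitution(words, vocab, log_target,
                                     n_steps=200, seed=0):
    """Metropolis-Hastings over sentences with a symmetric substitution proposal.

    log_target returns an unnormalised log-density for a list of words.
    Returns the final state and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    current = list(words)
    current_lp = log_target(current)
    accepted = 0
    for _ in range(n_steps):
        # Propose: pick a position and a vocabulary word uniformly at random.
        pos = rng.randrange(len(current))
        proposal = list(current)
        proposal[pos] = rng.choice(vocab)
        proposal_lp = log_target(proposal)
        # Symmetric proposal, so the acceptance ratio is pi(x') / pi(x).
        log_alpha = proposal_lp - current_lp
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            current, current_lp = proposal, proposal_lp
            accepted += 1
    return current, accepted / n_steps

original = "the film was good and moving".split()
vocab = ["good", "fine", "bad", "dull", "moving", "film", "the", "was", "and"]

def toy_log_target(words):
    # Hypothetical target: reward words that would flip a sentiment model,
    # penalise every position that differs from the original sentence.
    reward = 2.0 * sum(w in {"bad", "dull"} for w in words)
    edits = sum(a != b for a, b in zip(words, original))
    return reward - 1.0 * edits

sample, accept_rate = metropolis_hastings_substitution(
    original, vocab, toy_log_target)
print(" ".join(sample), accept_rate)
```

The penalty term in the toy target plays the part of imperceptibility: proposals that change many words are accepted less often unless they bring a matching gain in attack reward.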

* 21 pages, 9 tables, 3 figures

Learning to Prevent Profitless Neural Code Completion

Sep 13, 2022

Zhensu Sun, Xiaoning Du, Fu Song, Shangwen Wang, Mingze Ni, Li Li

Abstract: Currently, large pre-trained models are widely applied in neural code completion systems, such as GitHub Copilot, aiXcoder, and TabNine. Though large models significantly outperform their smaller counterparts, a survey with 2,631 participants reveals that around 70% of displayed code completions from Copilot are not accepted by developers. Being reviewed but not accepted, these completions bring a threat to productivity. Besides, considering the high cost of the large models, it is a huge waste of computing resources and energy, which severely goes against the sustainable development principle of AI technologies. Additionally, in code completion systems, the completion requests are automatically and actively issued to the models as developers type, which significantly aggravates the workload. However, to the best of our knowledge, such waste has never been realized, not to mention effectively addressed, in the context of neural code completion. Hence, preventing such profitless code completions from happening in a cost-friendly way is of urgent need. To fill this gap, we first investigate the prompts of these completions and find four observable prompt patterns, which demonstrate the feasibility of identifying such prompts based on prompts themselves. Motivated by this finding, we propose an early-rejection mechanism to turn down low-return prompts by foretelling the completion qualities without sending them to the LCM. Further, we propose a lightweight Transformer-based estimator to demonstrate the feasibility of the mechanism. The experimental results show that the estimator rejects low-return prompts with a promising accuracy of 83.2%.
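
The control flow of an early-rejection mechanism can be sketched in a few lines: a cheap estimator scores the prompt first, and the expensive model is called only when the score clears a threshold. Everything named here is hypothetical. `toy_estimator` is a length heuristic standing in for the paper's Transformer-based estimator, `fake_model` stands in for the completion model, and the threshold of 0.5 is arbitrary.

```python
from typing import Callable, List, Optional

def complete_with_early_rejection(
    prompt: str,
    estimate_quality: Callable[[str], float],
    request_completion: Callable[[str], str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Call the completion model only if the estimated quality is high enough."""
    if estimate_quality(prompt) < threshold:
        return None  # rejected early: the expensive model is never queried
    return request_completion(prompt)

# Hypothetical stand-ins, used only to exercise the control flow.
calls: List[str] = []

def toy_estimator(prompt: str) -> float:
    # Placeholder heuristic: very short prompts are assumed to be low-return.
    return min(1.0, len(prompt.strip()) / 40)

def fake_model(prompt: str) -> str:
    calls.append(prompt)  # record every request that reaches the model
    return prompt + " <completion>"

rejected = complete_with_early_rejection("x =", toy_estimator, fake_model)
served = complete_with_early_rejection(
    "def parse_config(path: str) -> dict:", toy_estimator, fake_model)
print(rejected, served, len(calls))
```

The saving comes from the asymmetry in cost: the estimator runs on every request, so it only pays off if it is much cheaper than the model it guards.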

* 10 pages

CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning

Oct 25, 2021

Zhensu Sun, Xiaoning Du, Fu Song, Mingze Ni, Li Li

Abstract: GitHub Copilot, trained on billions of lines of public code, has recently become the buzzword in the computer science research and practice community. Although it is designed to provide powerful intelligence to help developers implement safe and effective code, practitioners and researchers raise concerns about its ethical and security problems, e.g., should the copyleft licensed code be freely leveraged or insecure code be considered for training in the first place? These problems pose a significant impact on Copilot and other similar products that aim to learn knowledge from large-scale source code through deep learning models, which are inevitably on the rise with the fast development of artificial intelligence. To mitigate such impacts, we argue that there is a need to invent effective mechanisms for protecting open-source code from being exploited by deep learning models. To this end, we design and implement a prototype, CoProtector, which utilizes data poisoning techniques to arm source code repositories for defending against such exploits. Our large-scale experiments empirically show that CoProtector is effective in achieving its purpose, significantly reducing the performance of Copilot-like deep learning models while being able to stably reveal the secretly embedded watermark backdoors.

* 8 pages
