Xiangyu Zhou

Automatic Calibration for Membership Inference Attack on Large Language Models

May 06, 2025

Saleh Zare Zade, Yao Qiang, Xiangyu Zhou, Hui Zhu, Mohammad Amin Roshani, Prashant Khanduri, Dongxiao Zhu

Abstract: Membership Inference Attacks (MIAs) have recently been employed to determine whether a specific text was part of the pre-training data of Large Language Models (LLMs). However, existing methods often misinfer non-members as members, leading to a high false positive rate, or depend on additional reference models for probability calibration, which limits their practicality. To overcome these challenges, we introduce a novel framework called Automatic Calibration Membership Inference Attack (ACMIA), which utilizes a tunable temperature to calibrate output probabilities effectively. This approach is inspired by our theoretical insights into maximum likelihood estimation during the pre-training of LLMs. We introduce ACMIA in three configurations designed to accommodate different levels of model access and increase the probability gap between members and non-members, improving the reliability and robustness of membership inference. Extensive experiments on various open-source LLMs demonstrate that our proposed attack is highly effective, robust, and generalizable, surpassing state-of-the-art baselines across three widely used benchmarks. Our code is available at: https://github.com/Salehzz/ACMIA.
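
The abstract mentions a tunable temperature for calibrating output probabilities but does not give the scoring formula. The sketch below only illustrates the generic idea of temperature scaling: dividing logits by a temperature before the softmax changes the log-probability assigned to each observed token. The function names, the use of a mean over tokens, and the toy logits are assumptions made for illustration; this is not the ACMIA implementation.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log-softmax of logits / temperature, computed stably."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def mean_token_log_prob(step_logits, token_ids, temperature=1.0):
    """Average log-probability of the observed tokens (hypothetical score).

    step_logits: one list of vocabulary logits per position.
    token_ids:   the token actually observed at each position.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if len(step_logits) != len(token_ids) or not token_ids:
        raise ValueError("need one observed token per position")
    total = 0.0
    for logits, token in zip(step_logits, token_ids):
        total += log_softmax(logits, temperature)[token]
    return total / len(token_ids)


if __name__ == "__main__":
    # Toy example: a two-token vocabulary, one position, observed token 0.
    toy_logits = [[2.0, 0.0]]
    for t in (0.5, 1.0, 2.0):
        print(t, mean_token_log_prob(toy_logits, [0], temperature=t))
```

In this toy setting, a temperature below 1 sharpens the distribution and raises the score of the highest-logit token, while a temperature above 1 flattens it. How the temperature is chosen and how scores are thresholded is left to the paper and its repository.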

Exploring The Neural Burden In Pruned Models: An Insight Inspired By Neuroscience

Jul 27, 2024

Zeyu Wang, Weichen Dai, Xiangyu Zhou, Ji Qi, Yi Zhou

Abstract: Vision Transformer and its variants have been adopted in many visual tasks due to their powerful capabilities, which also bring significant challenges in computation and storage. Consequently, researchers have introduced various compression methods in recent years, among which pruning techniques are widely used to remove a significant fraction of the network. These methods can reduce a significant percentage of the FLOPs, but often lead to a decrease in model performance. To investigate the underlying causes, we focus on pruning methods in the pruning-during-training category, draw inspiration from neuroscience, and propose a new concept for artificial neural network models named Neural Burden. We investigate its impact on the model pruning process and explore a simple yet effective approach to mitigate the decline in model performance, which can be applied to any pruning-during-training technique. Extensive experiments indicate that the neural burden phenomenon indeed exists and show the potential of our method. We hope that our findings can provide valuable insights for future research. Code will be made publicly available after this paper is published.

Learning to Poison Large Language Models During Instruction Tuning

Feb 21, 2024

Yao Qiang, Xiangyu Zhou, Saleh Zare Zade, Mohammad Amin Roshani, Douglas Zytko, Dongxiao Zhu

Abstract: The advent of Large Language Models (LLMs) has marked significant achievements in language processing and reasoning capabilities. Despite these advances, LLMs are vulnerable to data poisoning attacks, where adversaries insert backdoor triggers into training data to manipulate outputs for malicious purposes. This work identifies additional security risks in LLMs by designing a new data poisoning attack tailored to exploit the instruction tuning process. We propose a novel gradient-guided backdoor trigger learning approach to identify adversarial triggers efficiently, evading detection by conventional defenses while maintaining content integrity. Through experimental validation across various LLMs and tasks, our strategy demonstrates a high success rate in compromising model outputs; poisoning only 1% of 4,000 instruction tuning samples leads to a Performance Drop Rate (PDR) of around 80%. Our work highlights the need for stronger defenses against data poisoning attacks, offering insights into safeguarding LLMs against these more sophisticated attacks. The source code can be found on this GitHub repository: https://github.com/RookieZxy/GBTL/blob/main/README.md.

Hijacking Large Language Models via Adversarial In-Context Learning

Nov 16, 2023

Yao Qiang, Xiangyu Zhou, Dongxiao Zhu

Abstract: In-context learning (ICL) has emerged as a powerful paradigm leveraging LLMs for specific tasks by utilizing labeled examples as demonstrations in the precondition prompts. Despite its promising performance, ICL is unstable with respect to the choice and arrangement of examples. Additionally, crafted adversarial attacks pose a notable threat to the robustness of ICL. However, existing attacks are either easy to detect, rely on external models, or lack specificity towards ICL. To address these issues, this work introduces a novel transferable attack for ICL, aiming to hijack LLMs to generate the targeted response. The proposed LLM hijacking attack leverages a gradient-based prompt search method to learn and append imperceptible adversarial suffixes to the in-context demonstrations. Extensive experimental results on various tasks and datasets demonstrate the effectiveness of our LLM hijacking attack, which distracts the model's attention toward the adversarial tokens and consequently leads to the targeted unwanted outputs.

Efficient Bottom-Up Synthesis for Programs with Local Variables

Nov 07, 2023

Xiang Li, Xiangyu Zhou, Rui Dong, Yihong Zhang, Xinyu Wang

Abstract: We propose a new synthesis algorithm that can efficiently search programs with local variables (e.g., those introduced by lambdas). Prior bottom-up synthesis algorithms are not able to evaluate programs with free local variables, and therefore cannot effectively reduce the search space of such programs (e.g., using standard observational equivalence reduction techniques), making synthesis slow. Our algorithm can reduce the space of programs with local variables. The key idea, dubbed lifted interpretation, is to lift up the program interpretation process, from evaluating one program at a time to simultaneously evaluating all programs from a grammar. Lifted interpretation provides a mechanism to systematically enumerate all binding contexts for local variables, thereby enabling us to evaluate and reduce the space of programs with local variables. Our ideas are instantiated in the domain of web automation. The resulting tool, Arborist, can automate a significantly broader range of challenging tasks more efficiently than state-of-the-art techniques including WebRobot and Helena.

* Accepted to POPL 2024
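
For context on the "standard observational equivalence reduction" that the abstract contrasts against, the following is a minimal sketch of classic bottom-up enumeration over a toy arithmetic grammar with a single input `x` and no local variables. It is not lifted interpretation and not Arborist; the grammar, the size measure, and the function name `bottom_up_synthesize` are assumptions chosen for illustration. Two programs that produce the same outputs on all example inputs are treated as equivalent, and only the first one found is kept.

```python
from itertools import product
from operator import add, mul

# Toy grammar (assumed): E ::= x | 1 | (E + E) | (E * E)
BINARY_OPS = [("+", add), ("*", mul)]


def bottom_up_synthesize(inputs, target_outputs, max_size=5):
    """Return an expression string matching the examples, or None."""
    target = tuple(target_outputs)
    seen = set()      # output signatures already represented in the bank
    bank = {1: []}    # size -> list of (expression, outputs on all inputs)

    terminals = [("x", tuple(inputs)), ("1", tuple(1 for _ in inputs))]
    for expr, outs in terminals:
        if outs in seen:
            continue  # observationally equivalent to an earlier terminal
        seen.add(outs)
        if outs == target:
            return expr
        bank[1].append((expr, outs))

    for size in range(2, max_size + 1):
        bank[size] = []
        # A binary node costs 1, so the children sizes sum to size - 1.
        for left_size in range(1, size - 1):
            right_size = size - 1 - left_size
            pairs = product(bank[left_size], bank[right_size])
            for (l_expr, l_out), (r_expr, r_out) in pairs:
                for symbol, fn in BINARY_OPS:
                    outs = tuple(fn(a, b) for a, b in zip(l_out, r_out))
                    if outs in seen:
                        continue  # prune: same behavior on every example
                    seen.add(outs)
                    expr = f"({l_expr} {symbol} {r_expr})"
                    if outs == target:
                        return expr
                    bank[size].append((expr, outs))
    return None


if __name__ == "__main__":
    # Examples consistent with 2*x + 1.
    print(bottom_up_synthesize([1, 2, 3], [3, 5, 7]))
```

The pruning step depends on being able to evaluate every candidate on the examples, which is exactly what the abstract says breaks down once subprograms contain free local variables.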

DGP-Net: Dense Graph Prototype Network for Few-Shot SAR Target Recognition

Feb 19, 2023

Xiangyu Zhou, Qianru Wei, Yuhui Zhang

Abstract: The inevitable feature deviation of synthetic aperture radar (SAR) images, caused by the special imaging principle (depression angle variation), leads to poor recognition accuracy, especially in few-shot learning (FSL). To deal with this problem, we propose a dense graph prototype network (DGP-Net) that eliminates the feature deviation by learning potential features and classifies by learning the feature distribution. The role of the prototype in this model is to address the large distance between congeneric samples that arises from the contingency of single sampling in FSL, and to enhance the robustness of the model. Experimental results on the MSTAR dataset show that DGP-Net achieves good classification results for SAR images with different depression angles, and that its recognition accuracy is higher than that of typical FSL methods.
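
To illustrate the general role a prototype plays in few-shot classification, here is a minimal sketch of nearest-prototype classification: each class prototype is the mean of that class's support embeddings, and a query is assigned to the closest prototype. This is the generic prototypical idea only. It does not include DGP-Net's dense graph component, and the function names, the squared Euclidean distance, and the toy 2-D embeddings are assumptions for illustration.

```python
def class_prototypes(support):
    """Mean embedding per class.

    support: dict mapping a class label to a list of equal-length vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Label of the prototype closest to the query embedding."""
    return min(
        prototypes,
        key=lambda label: squared_distance(query, prototypes[label]),
    )


if __name__ == "__main__":
    # Hypothetical 2-D embeddings for a 2-way, 2-shot episode.
    support = {
        "class_a": [[0.0, 0.0], [2.0, 0.0]],
        "class_b": [[10.0, 10.0], [12.0, 10.0]],
    }
    protos = class_prototypes(support)
    print(classify([1.5, 0.5], protos))
```

Averaging the support embeddings is what reduces sensitivity to any single unrepresentative sample, which matches the motivation the abstract gives for using a prototype.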
