Shanqing Guo

Jailbreaking Text-to-Image Models with LLM-Based Agents

Aug 01, 2024

Yingkai Dong, Zheng Li, Xiangtao Meng, Ning Yu, Shanqing Guo

Abstract: Recent advancements have significantly improved automated task-solving capabilities using autonomous agents powered by large language models (LLMs). However, most LLM-based agents focus on dialogue, programming, or specialized domains, leaving gaps in addressing generative AI safety tasks. These gaps are primarily due to the challenges posed by LLM hallucinations and the lack of clear guidelines. In this paper, we propose Atlas, an advanced LLM-based multi-agent framework that integrates an efficient fuzzing workflow to target generative AI models, specifically focusing on jailbreak attacks against text-to-image (T2I) models with safety filters. Atlas utilizes a vision-language model (VLM) to assess whether a prompt triggers the T2I model's safety filter. It then iteratively collaborates with both the LLM and the VLM to generate an alternative prompt that bypasses the filter. Atlas also enhances the reasoning abilities of LLMs in attack scenarios by leveraging multi-agent communication, in-context learning (ICL) memory mechanisms, and the chain-of-thought (CoT) approach. Our evaluation demonstrates that Atlas successfully jailbreaks several state-of-the-art T2I models equipped with multi-modal safety filters in a black-box setting. In addition, Atlas outperforms existing methods in both query efficiency and the quality of the generated images.

AVA: Inconspicuous Attribute Variation-based Adversarial Attack bypassing DeepFake Detection

Dec 14, 2023

Xiangtao Meng, Li Wang, Shanqing Guo, Lei Ju, Qingchuan Zhao

Abstract: While DeepFake applications have become popular in recent years, their abuse poses a serious privacy threat. Unfortunately, most detection algorithms intended to mitigate such abuse are inherently vulnerable to adversarial attacks because they are built atop DNN-based classification models, and the literature has demonstrated that they can be bypassed by introducing pixel-level perturbations. Though corresponding mitigations have been proposed, we have identified a new attribute-variation-based adversarial attack (AVA) that perturbs the latent space via a combination of a Gaussian prior and a semantic discriminator to bypass such mitigations. It perturbs semantics in the attribute space of DeepFake images in ways that are inconspicuous to human beings (e.g., mouth open) but can result in substantial differences in DeepFake detection. We evaluate our proposed AVA attack on nine state-of-the-art DeepFake detection algorithms and applications. The empirical results demonstrate that the AVA attack outperforms state-of-the-art black-box attacks against DeepFake detectors and achieves more than a 95% success rate on two commercial DeepFake detectors. Moreover, our human study indicates that AVA-generated DeepFake images are often imperceptible to humans, which raises serious security and privacy concerns.

RNN-Guard: Certified Robustness Against Multi-frame Attacks for Recurrent Neural Networks

Apr 17, 2023

Yunruo Zhang, Tianyu Du, Shouling Ji, Peng Tang, Shanqing Guo

Abstract: It is well known that recurrent neural networks (RNNs), although widely used, are vulnerable to adversarial attacks, including one-frame attacks and multi-frame attacks. Though a few certified defenses exist to provide guaranteed robustness against one-frame attacks, we prove that defending against multi-frame attacks remains a challenging problem due to their enormous perturbation space. In this paper, we propose RNN-Guard, the first certified defense against multi-frame attacks for RNNs. To address the above challenge, we adopt the perturb-all-frame strategy to construct perturbation spaces consistent with those in multi-frame attacks. However, the perturb-all-frame strategy causes a precision issue in linear relaxations. To address this issue, we introduce a novel abstract domain called InterZono and design tighter relaxations. We prove that InterZono is more precise than Zonotope yet carries the same time complexity. Experimental evaluations across various datasets and model structures show that the certified robust accuracy calculated by RNN-Guard with InterZono is up to 2.18 times higher than that with Zonotope. In addition, we extend RNN-Guard as the first certified training method against multi-frame attacks to directly enhance RNNs' robustness. The results show that the certified robust accuracy of models trained with RNN-Guard against multi-frame attacks is 15.47 to 67.65 percentage points higher than that of models trained with other methods.

* 13 pages, 7 figures, 6 tables

Seeing is Living? Rethinking the Security of Facial Liveness Verification in the Deepfake Era

Feb 22, 2022

Changjiang Li, Li Wang, Shouling Ji, Xuhong Zhang, Zhaohan Xi, Shanqing Guo, Ting Wang

Abstract: Facial Liveness Verification (FLV) is widely used for identity authentication in many security-sensitive domains and offered as Platform-as-a-Service (PaaS) by leading cloud vendors. Yet, with the rapid advances in synthetic media techniques (e.g., deepfake), the security of FLV is facing unprecedented challenges, about which little is known thus far. To bridge this gap, in this paper, we conduct the first systematic study on the security of FLV in real-world settings. Specifically, we present LiveBugger, a new deepfake-powered attack framework that enables customizable, automated security evaluation of FLV. Leveraging LiveBugger, we perform a comprehensive empirical assessment of representative FLV platforms, leading to a set of interesting findings. For instance, most FLV APIs do not use anti-deepfake detection; even for those with such defenses, their effectiveness is concerning (e.g., they may detect high-quality synthesized videos but fail to detect low-quality ones). We then conduct an in-depth analysis of the factors impacting the attack performance of LiveBugger: a) the bias (e.g., gender or race) in FLV can be exploited to select victims; b) adversarial training makes deepfakes more effective at bypassing FLV; c) the input quality has a varying influence on how well different deepfake techniques bypass FLV. Based on these findings, we propose a customized, two-stage approach that can boost the attack success rate by up to 70%. Further, we run proof-of-concept attacks on several representative applications of FLV (i.e., the clients of FLV APIs) to illustrate the practical implications: due to the vulnerability of the APIs, many downstream applications are vulnerable to deepfakes. Finally, we discuss potential countermeasures to improve the security of FLV. Our findings have been confirmed by the corresponding vendors.

* Accepted as a full paper at USENIX Security '22

SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems

Mar 19, 2021

Yuxuan Chen, Jiangshan Zhang, Xuejing Yuan, Shengzhi Zhang, Kai Chen, Xiaofeng Wang, Shanqing Guo

Abstract: With the wide use of Automatic Speech Recognition (ASR) in applications such as human-machine interaction, simultaneous interpretation, and audio transcription, its security protection becomes increasingly important. Although recent studies have brought to light the weaknesses of popular ASR systems that enable out-of-band signal attacks, adversarial attacks, etc., and further proposed various remedies (signal smoothing, adversarial training, etc.), a systematic understanding of ASR security (both attacks and defenses) is still missing, especially on how realistic such threats are and how general existing protection could be. In this paper, we present our systematization of knowledge for ASR security and provide a comprehensive taxonomy for existing work based on a modularized workflow. More importantly, we align the research in this domain with that on security in Image Recognition Systems (IRS), which has been extensively studied, using the domain knowledge in the latter to help understand where we stand in the former. Generally, both IRS and ASR are perceptual systems. Their similarities allow us to systematically study existing literature in ASR security based on the spectrum of attacks and defense solutions proposed for IRS, and pinpoint the directions of more advanced attacks and the directions potentially leading to more effective protection in ASR. In contrast, their differences, especially the complexity of ASR compared with IRS, help us learn unique challenges and opportunities in ASR security. In particular, our experimental study shows that transfer learning across ASR models is feasible, even in the absence of knowledge about the models (even their types) and training data.

* 17 pages

DeepStego: Protecting Intellectual Property of Deep Neural Networks by Steganography

Mar 13, 2019

Zheng Li, Ge Han, Shanqing Guo, Chengyu Hu

Abstract: Deep Neural Networks (DNNs) have shown great success in various challenging tasks. Training these networks is computationally expensive and requires vast amounts of training data. Therefore, it is necessary to design a technology to protect the intellectual property (IP) of the model and externally verify the ownership of the model in a black-box way. Previous studies either fail to meet the black-box requirement or have not dealt with several forms of security and legal problems. In this paper, we propose a novel steganographic scheme for watermarking Deep Neural Networks during training. This is the first feasible scheme to protect DNNs that perfectly solves the problems of safety and legality. We demonstrate experimentally that such a watermark has no obvious influence on the main task the model is designed for and can successfully verify the ownership of the model. Furthermore, we demonstrate the scheme's robustness by simulating it in a real situation.

* Some experiments need to be done

Learning Symmetric and Asymmetric Steganography via Adversarial Training

Mar 13, 2019

Zheng Li, Ge Han, Yunqing Wei, Shanqing Guo

Abstract: Steganography refers to the art of concealing secret messages within multiple media carriers so that an eavesdropper is unable to detect the presence and content of the hidden messages. In this paper, we propose a novel key-dependent steganographic scheme that achieves steganographic objectives with adversarial training. Symmetric (secret-key) and asymmetric (public-key) steganographic schemes are proposed separately, and each scheme is successfully designed and implemented. We show that the encodings produced by our scheme improve invisibility by 20% over previous deep-learning-based work and achieve undetectability 25% better than classic steganographic algorithms. Finally, we simulated our scheme in a real situation where the decoder recovered the original message with an accuracy of more than 98%.
