Abstract:Although deep neural networks (DNNs) have shown impressive performance on many perceptual tasks, they are vulnerable to adversarial examples that are generated by adding slight but maliciously crafted perturbations to benign images. Adversarial detection is an important technique for identifying adversarial examples before they are fed into target DNNs. Previous studies on detecting adversarial examples either targeted specific attacks or required expensive computation. How to design a lightweight unsupervised detector remains a challenging problem. In this paper, we propose an AutoEncoder-based Adversarial Examples (AEAE) detector that can guard DNN models by detecting adversarial examples with low computation in an unsupervised manner. AEAE consists of only a shallow autoencoder but plays two roles. First, a well-trained autoencoder has learned the manifold of benign examples. This autoencoder produces a large reconstruction error for adversarial images with large perturbations, so we can detect significantly perturbed adversarial examples based on the reconstruction error. Second, the autoencoder can filter out small noise and change the DNN's prediction on adversarial examples with small perturbations. This helps to detect slightly perturbed adversarial examples based on the prediction distance. To cover both cases, we use the reconstruction error and the prediction distance of benign images to construct a two-tuple feature set and train an adversarial detector with the isolation forest algorithm. We show empirically that AEAE is unsupervised and inexpensive against the most advanced state-of-the-art attacks. Through detection in these two cases, adversarial examples have nowhere to hide.
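As a rough illustration of the two-feature detection scheme sketched in this abstract, the snippet below computes a per-image reconstruction error and a prediction distance, then fits an isolation forest on benign features only. The model objects (Keras-style `autoencoder` and `classifier`), the array shapes, the L1 prediction distance, and the contamination setting are our own assumptions for the sketch, not the paper's exact implementation.

```python
# Minimal sketch of the two-tuple feature idea: (reconstruction error, prediction distance).
# `autoencoder` and `classifier` are assumed Keras-style models; `x_benign`/`x_test` are
# assumed NHWC image arrays. Distances and thresholds are illustrative choices.
import numpy as np
from sklearn.ensemble import IsolationForest

def detection_features(images, autoencoder, classifier):
    recon = autoencoder.predict(images)                            # autoencoder reconstruction
    recon_err = np.mean((images - recon) ** 2, axis=(1, 2, 3))     # per-image MSE
    p_orig = classifier.predict(images)                            # softmax on original input
    p_recon = classifier.predict(recon)                            # softmax on reconstruction
    pred_dist = np.linalg.norm(p_orig - p_recon, ord=1, axis=1)    # L1 prediction distance
    return np.stack([recon_err, pred_dist], axis=1)

# Fit the unsupervised detector on benign images only.
benign_feats = detection_features(x_benign, autoencoder, classifier)
detector = IsolationForest(contamination=0.01).fit(benign_feats)

# At test time, -1 marks an outlier, i.e., a suspected adversarial example.
is_adversarial = detector.predict(detection_features(x_test, autoencoder, classifier)) == -1
```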
Abstract:Recent works have shown that interval bound propagation (IBP) can be used to train verifiably robust neural networks. Researchers have observed an intriguing phenomenon on these IBP-trained networks: CROWN, a bounding method based on tight linear relaxation, often gives very loose bounds on them. We also observe that most neurons become dead during the IBP training process, which could hurt the representation capability of the network. In this paper, we study the relationship between IBP and CROWN, and prove that CROWN is always tighter than IBP when appropriate bounding lines are chosen. We further propose a relaxed version of CROWN, linear bound propagation (LBP), which can be used to verify large networks and obtain lower verified errors than IBP. We also design a new activation function, the parameterized ramp function (ParamRamp), which offers more diverse neuron states than ReLU. We conduct extensive experiments on MNIST, CIFAR-10 and Tiny-ImageNet with the ParamRamp activation and achieve state-of-the-art verified robustness. Code and the appendix are available at https://github.com/ZhaoyangLyu/VerifiablyRobustNN.
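For readers unfamiliar with IBP, the minimal sketch below shows the standard interval propagation rule through one affine layer followed by a ReLU. It only illustrates the bounding mechanism that the abstract compares against CROWN and LBP; it is not the paper's training code and does not include the ParamRamp activation.

```python
# Standard interval bound propagation (IBP) through x -> ReLU(W x + b).
import torch

def ibp_affine(l, u, W, b):
    """Propagate elementwise bounds [l, u] through the affine map x -> W x + b."""
    center = (u + l) / 2
    radius = (u - l) / 2
    new_center = W @ center + b
    new_radius = W.abs() @ radius          # |W| spreads the interval radius
    return new_center - new_radius, new_center + new_radius

def ibp_relu(l, u):
    """ReLU is monotone, so the bounds pass through elementwise."""
    return l.clamp(min=0), u.clamp(min=0)

# Example: bounds for an eps-ball around an input x under one layer.
x, eps = torch.randn(784), 0.1
W, b = torch.randn(128, 784), torch.randn(128)
l, u = ibp_relu(*ibp_affine(x - eps, x + eps, W, b))
```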
Abstract:To launch black-box attacks against a Deep Neural Network (DNN) based Face Recognition (FR) system, one needs to build \textit{substitute} models to simulate the target model, so that adversarial examples discovered on the substitute models also mislead the target model. Such \textit{transferability} is achieved in recent studies by querying the target model to obtain data for training the substitute models. A real-world target, like the FR system of law enforcement, however, is less accessible to the adversary. To attack such a system, a substitute model of similar quality to the target model is needed to identify their common defects. This is hard, since the adversary often does not have enough resources to train such a powerful model (hundreds of millions of images and rooms of GPUs are needed to train a commercial FR system). We found in our research, however, that a resource-constrained adversary can still effectively approximate the target model's capability to recognize \textit{specific} individuals, by training \textit{biased} substitute models on additional images of those victims whose identities the attacker wants to cover or impersonate. This is made possible by a new property we discovered, called \textit{Nearly Local Linearity} (NLL), which models the observation that an ideal DNN model produces image representations (embeddings) whose distances among themselves truthfully describe the human perception of the differences among the input images. By simulating this property around the victim's images, we significantly improve the transferability of black-box impersonation attacks by nearly 50\%. In particular, we successfully attacked a commercial system trained on over 20 million images, using 4 million images and 1/5 of the training time, yet achieving 62\% transferability in an impersonation attack and 89\% in a dodging attack.
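The nearly-local-linearity observation can be probed informally as follows: along a pixel-space interpolation path between a victim image and another image, an "ideal" embedding should drift away from the victim roughly in proportion to the interpolation weight. The sketch below is only our illustrative proxy for that intuition; the `embed` function and the deviation measure are assumptions, not the paper's formal NLL definition or its training loss.

```python
# Illustrative probe of near-linear embedding behavior along an image interpolation path.
# `embed` is an assumed function mapping an image batch to L2-normalized embeddings.
import numpy as np

def nll_probe(embed, x, y, steps=10):
    alphas = np.linspace(0.0, 1.0, steps + 1)
    path = np.stack([(1 - a) * x + a * y for a in alphas])   # pixel-space interpolation
    e = embed(path)
    d = np.linalg.norm(e - e[0], axis=1)                     # distance from the victim image
    d = d / (d[-1] + 1e-12)                                  # normalize to [0, 1]
    return np.max(np.abs(d - alphas))                        # deviation from a straight line

# A small returned value means embedding distances grow almost linearly along the path,
# i.e., they mirror the gradual visual change between the two images.
```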
Abstract:Face authentication systems are becoming increasingly prevalent, especially with the rapid development of Deep Learning technologies. However, human facial information is easy to capture and reproduce, which makes face authentication systems vulnerable to various attacks. Liveness detection is an important defense technique against such attacks, but existing solutions do not provide clear and strong security guarantees, especially in terms of time. To overcome these limitations, we propose a new liveness detection protocol called Face Flashing that significantly raises the bar for launching successful attacks on face authentication systems. By randomly flashing well-designed pictures on a screen and analyzing the reflected light, our protocol leverages physical characteristics of human faces: reflection processing at the speed of light, unique textural features, and uneven 3D shapes. By cooperating with the working mechanisms of the screen and digital cameras, our protocol is able to detect subtle traces left by an attacking process. To demonstrate the effectiveness of Face Flashing, we implemented a prototype and performed thorough evaluations on a large dataset collected from real-world scenarios. The results show that our Timing Verification can effectively detect the time gap between legitimate authentications and malicious cases. Our Face Verification can also accurately differentiate 2D planes from 3D objects. The overall accuracy of our liveness detection system is 98.8\%, and its robustness was evaluated in different scenarios. In the worst case, our system's accuracy decreased to a still-high 97.3\%.
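A conceptual sketch of the timing check described above follows: flash a challenge color and require its reflection to appear in the camera stream within a tight latency bound, since a reflection arrives essentially instantly while an attack that synthesizes or relays frames adds measurable delay. The helper callables and the thresholds are hypothetical placeholders, not the paper's exact protocol or parameters.

```python
# Conceptual sketch of a timing verification step. All helpers are injected callables
# (assumed, not part of the paper's implementation); thresholds are illustrative.
import time

MAX_RESPONSE_DELAY = 0.10   # seconds; assumed bound on legitimate screen-to-camera latency
CHALLENGE_TIMEOUT = 1.0     # seconds; give up and flag the session after this long

def timing_verification(color, show_color, read_frame, matches_reflection):
    """Flash one challenge color and check how quickly its reflection shows up.

    show_color(color)        -- displays the challenge on the screen (assumed helper)
    read_frame()             -- grabs the next camera frame (assumed helper)
    matches_reflection(f, c) -- True if frame f contains the reflection of color c (assumed)
    """
    t_flash = time.monotonic()
    show_color(color)
    while time.monotonic() - t_flash < CHALLENGE_TIMEOUT:
        if matches_reflection(read_frame(), color):
            return (time.monotonic() - t_flash) <= MAX_RESPONSE_DELAY
    return False   # no timely reflection: treat the session as suspicious
```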