Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wenbo Jiang

BadRefSR: Backdoor Attacks Against Reference-based Image Super Resolution

Feb 28, 2025

Xue Yang, Tao Chen, Lei Guo, Wenbo Jiang, Ji Guo, Yongming Li, Jiaming He

Abstract:Reference-based image super-resolution (RefSR) represents a promising advancement in super-resolution (SR). In contrast to single-image super-resolution (SISR), RefSR leverages an additional reference image to help recover high-frequency details, yet its vulnerability to backdoor attacks has not been explored. To fill this research gap, we propose a novel attack framework called BadRefSR, which embeds backdoors in the RefSR model by adding triggers to the reference images and training with a mixed loss function. Extensive experiments across various backdoor attack settings demonstrate the effectiveness of BadRefSR. The compromised RefSR network performs normally on clean input images, while outputting attacker-specified target images on triggered input images. Our study aims to alert researchers to the potential backdoor risks in RefSR. Codes are available at https://github.com/xuefusiji/BadRefSR.

* 5 pages,4 figures

Via

Access Paper or Ask Questions

DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Jan 11, 2025

Wenshu Fan, Minxing Zhang, Hongwei Li, Wenbo Jiang, Hanxiao Chen, Xiangyu Yue, Michael Backes, Xiao Zhang

Figure 1 for DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Figure 2 for DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Figure 3 for DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Figure 4 for DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Abstract:The widespread adoption of facial recognition (FR) models raises serious concerns about their potential misuse, motivating the development of anti-facial recognition (AFR) to protect user facial privacy. In this paper, we argue that the static FR strategy, predominantly adopted in prior literature for evaluating AFR efficacy, cannot faithfully characterize the actual capabilities of determined trackers who aim to track a specific target identity. In particular, we introduce \emph{\ourAttack}, a dynamic FR strategy where the model's gallery database is iteratively updated with newly recognized target identity images. Surprisingly, such a simple approach renders all the existing AFR protections ineffective. To mitigate the privacy threats posed by DynTracker, we advocate for explicitly promoting diversity in the AFR-protected images. We hypothesize that the lack of diversity is the primary cause of the failure of existing AFR methods. Specifically, we develop \emph{DivTrackee}, a novel method for crafting diverse AFR protections that builds upon a text-guided image generation framework and diversity-promoting adversarial losses. Through comprehensive experiments on various facial image benchmarks and feature extractors, we demonstrate DynTracker's strength in breaking existing AFR methods and the superiority of DivTrackee in preventing user facial images from being identified by dynamic FR strategies. We believe our work can act as an important initial step towards developing more effective AFR methods for protecting user facial privacy against determined trackers.

Via

Access Paper or Ask Questions

Stealthy and Robust Backdoor Attack against 3D Point Clouds through Additional Point Features

Dec 10, 2024

Xiaoyang Ning, Qing Xie, Jinyu Xu, Wenbo Jiang, Jiachen Li, Yanchun Ma

Abstract:Recently, 3D backdoor attacks have posed a substantial threat to 3D Deep Neural Networks (3D DNNs) designed for 3D point clouds, which are extensively deployed in various security-critical applications. Although the existing 3D backdoor attacks achieved high attack performance, they remain vulnerable to preprocessing-based defenses (e.g., outlier removal and rotation augmentation) and are prone to detection by human inspection. In pursuit of a more challenging-to-defend and stealthy 3D backdoor attack, this paper introduces the Stealthy and Robust Backdoor Attack (SRBA), which ensures robustness and stealthiness through intentional design considerations. The key insight of our attack involves applying a uniform shift to the additional point features of point clouds (e.g., reflection intensity) widely utilized as part of inputs for 3D DNNs as the trigger. Without altering the geometric information of the point clouds, our attack ensures visual consistency between poisoned and benign samples, and demonstrate robustness against preprocessing-based defenses. In addition, to automate our attack, we employ Bayesian Optimization (BO) to identify the suitable trigger. Extensive experiments suggest that SRBA achieves an attack success rate (ASR) exceeding 94% in all cases, and significantly outperforms previous SOTA methods when multiple preprocessing operations are applied during training.

Via

Access Paper or Ask Questions

One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks

Oct 31, 2024

Ji Guo, Wenbo Jiang, Rui Zhang, Guoming Lu, Hongwei Li

Figure 1 for One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks

Figure 2 for One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks

Figure 3 for One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks

Figure 4 for One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks

Abstract:Recently, the success of Text-to-Image (T2I) models has led to the rise of numerous third-party platforms, which claim to provide cheaper API services and more flexibility in model options. However, this also raises a new security concern: Are these third-party services truly offering the models they claim? To address this problem, we propose the first T2I model verification method named Text-to-Image Model Verification via Non-Transferable Adversarial Attacks (TVN). The non-transferability of adversarial examples means that these examples are only effective on a target model and ineffective on other models, thereby allowing for the verification of the target model. TVN utilizes the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to optimize the cosine similarity of a prompt's text encoding, generating non-transferable adversarial prompts. By calculating the CLIP-text scores between the non-transferable adversarial prompts without perturbations and the images, we can verify if the model matches the claimed target model, based on a 3-sigma threshold. The experiments showed that TVN performed well in both closed-set and open-set scenarios, achieving a verification accuracy of over 90\%. Moreover, the adversarial prompts generated by TVN significantly reduced the CLIP-text scores of the target model, while having little effect on other models.

Via

Access Paper or Ask Questions

Backdoor Attack Against Vision Transformers via Attention Gradient-Based Image Erosion

Oct 30, 2024

Ji Guo, Hongwei Li, Wenbo Jiang, Guoming Lu

Abstract:Vision Transformers (ViTs) have outperformed traditional Convolutional Neural Networks (CNN) across various computer vision tasks. However, akin to CNN, ViTs are vulnerable to backdoor attacks, where the adversary embeds the backdoor into the victim model, causing it to make wrong predictions about testing samples containing a specific trigger. Existing backdoor attacks against ViTs have the limitation of failing to strike an optimal balance between attack stealthiness and attack effectiveness. In this work, we propose an Attention Gradient-based Erosion Backdoor (AGEB) targeted at ViTs. Considering the attention mechanism of ViTs, AGEB selectively erodes pixels in areas of maximal attention gradient, embedding a covert backdoor trigger. Unlike previous backdoor attacks against ViTs, AGEB achieves an optimal balance between attack stealthiness and attack effectiveness, ensuring the trigger remains invisible to human detection while preserving the model's accuracy on clean samples. Extensive experimental evaluations across various ViT architectures and datasets confirm the effectiveness of AGEB, achieving a remarkable Attack Success Rate (ASR) without diminishing Clean Data Accuracy (CDA). Furthermore, the stealthiness of AGEB is rigorously validated, demonstrating minimal visual discrepancies between the clean and the triggered images.

* Accepted by IEEE GLOBECOM 2024

Via

Access Paper or Ask Questions

DDFAD: Dataset Distillation Framework for Audio Data

Jul 15, 2024

Wenbo Jiang, Rui Zhang, Hongwei Li, Xiaoyuan Liu, Haomiao Yang, Shui Yu

Figure 1 for DDFAD: Dataset Distillation Framework for Audio Data

Figure 2 for DDFAD: Dataset Distillation Framework for Audio Data

Figure 3 for DDFAD: Dataset Distillation Framework for Audio Data

Figure 4 for DDFAD: Dataset Distillation Framework for Audio Data

Abstract:Deep neural networks (DNNs) have achieved significant success in numerous applications. The remarkable performance of DNNs is largely attributed to the availability of massive, high-quality training datasets. However, processing such massive training data requires huge computational and storage resources. Dataset distillation is a promising solution to this problem, offering the capability to compress a large dataset into a smaller distilled dataset. The model trained on the distilled dataset can achieve comparable performance to the model trained on the whole dataset. While dataset distillation has been demonstrated in image data, none have explored dataset distillation for audio data. In this work, for the first time, we propose a Dataset Distillation Framework for Audio Data (DDFAD). Specifically, we first propose the Fused Differential MFCC (FD-MFCC) as extracted features for audio data. After that, the FD-MFCC is distilled through the matching training trajectory distillation method. Finally, we propose an audio signal reconstruction algorithm based on the Griffin-Lim Algorithm to reconstruct the audio signal from the distilled FD-MFCC. Extensive experiments demonstrate the effectiveness of DDFAD on various audio datasets. In addition, we show that DDFAD has promising application prospects in many applications, such as continual learning and neural architecture search.

Via

Access Paper or Ask Questions

Backdoor Attacks against Image-to-Image Networks

Jul 15, 2024

Wenbo Jiang, Hongwei Li, Jiaming He, Rui Zhang, Guowen Xu, Tianwei Zhang, Rongxing Lu

Figure 1 for Backdoor Attacks against Image-to-Image Networks

Figure 2 for Backdoor Attacks against Image-to-Image Networks

Figure 3 for Backdoor Attacks against Image-to-Image Networks

Figure 4 for Backdoor Attacks against Image-to-Image Networks

Abstract:Recently, deep learning-based Image-to-Image (I2I) networks have become the predominant choice for I2I tasks such as image super-resolution and denoising. Despite their remarkable performance, the backdoor vulnerability of I2I networks has not been explored. To fill this research gap, we conduct a comprehensive investigation on the susceptibility of I2I networks to backdoor attacks. Specifically, we propose a novel backdoor attack technique, where the compromised I2I network behaves normally on clean input images, yet outputs a predefined image of the adversary for malicious input images containing the trigger. To achieve this I2I backdoor attack, we propose a targeted universal adversarial perturbation (UAP) generation algorithm for I2I networks, where the generated UAP is used as the backdoor trigger. Additionally, in the backdoor training process that contains the main task and the backdoor task, multi-task learning (MTL) with dynamic weighting methods is employed to accelerate convergence rates. In addition to attacking I2I tasks, we extend our I2I backdoor to attack downstream tasks, including image classification and object detection. Extensive experiments demonstrate the effectiveness of the I2I backdoor on state-of-the-art I2I network architectures, as well as the robustness against different mainstream backdoor defenses.

Via

Access Paper or Ask Questions

Talk Too Much: Poisoning Large Language Models under Token Limit

Apr 24, 2024

Jiaming He, Wenbo Jiang, Guanyu Hou, Wenshu Fan, Rui Zhang, Hongwei Li

Abstract:Mainstream poisoning attacks on large language models (LLMs) typically set a fixed trigger in the input instance and specific responses for triggered queries. However, the fixed trigger setting (e.g., unusual words) may be easily detected by human detection, limiting the effectiveness and practicality in real-world scenarios. To enhance the stealthiness of the trigger, we present a poisoning attack against LLMs that is triggered by a generation/output condition-token limitation, which is a commonly adopted strategy by users for reducing costs. The poisoned model performs normally for output without token limitation, while becomes harmful for output with limited tokens. To achieve this objective, we introduce BrieFool, an efficient attack framework. It leverages the characteristics of generation limitation by efficient instruction sampling and poisoning data generation, thereby influencing the behavior of LLMs under target conditions. Our experiments demonstrate that BrieFool is effective across safety domains and knowledge domains. For instance, with only 20 generated poisoning examples against GPT-3.5-turbo, BrieFool achieves a 100% Attack Success Rate (ASR) and a 9.28/10 average Harmfulness Score (HS) under token limitation conditions while maintaining the benign performance.

Via

Access Paper or Ask Questions

Rapid Adoption, Hidden Risks: The Dual Impact of Large Language Model Customization

Feb 15, 2024

Rui Zhang, Hongwei Li, Rui Wen, Wenbo Jiang, Yuan Zhang, Michael Backes, Yun Shen, Yang Zhang

Abstract:The increasing demand for customized Large Language Models (LLMs) has led to the development of solutions like GPTs. These solutions facilitate tailored LLM creation via natural language prompts without coding. However, the trustworthiness of third-party custom versions of LLMs remains an essential concern. In this paper, we propose the first instruction backdoor attacks against applications integrated with untrusted customized LLMs (e.g., GPTs). Specifically, these attacks embed the backdoor into the custom version of LLMs by designing prompts with backdoor instructions, outputting the attacker's desired result when inputs contain the pre-defined triggers. Our attack includes 3 levels of attacks: word-level, syntax-level, and semantic-level, which adopt different types of triggers with progressive stealthiness. We stress that our attacks do not require fine-tuning or any modification to the backend LLMs, adhering strictly to GPTs development guidelines. We conduct extensive experiments on 4 prominent LLMs and 5 benchmark text classification datasets. The results show that our instruction backdoor attacks achieve the desired attack performance without compromising utility. Additionally, we propose an instruction-ignoring defense mechanism and demonstrate its partial effectiveness in mitigating such attacks. Our findings highlight the vulnerability and the potential risks of LLM customization such as GPTs.

Via

Access Paper or Ask Questions