Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shaohui Mei

Revealing the Intrinsic Ethical Vulnerability of Aligned Large Language Models

Apr 07, 2025

Jiawei Lian, Jianhong Pan, Lefan Wang, Yi Wang, Shaohui Mei, Lap-Pui Chau

Abstract:Large language models (LLMs) are foundational explorations to artificial general intelligence, yet their alignment with human values via instruction tuning and preference learning achieves only superficial compliance. Here, we demonstrate that harmful knowledge embedded during pretraining persists as indelible "dark patterns" in LLMs' parametric memory, evading alignment safeguards and resurfacing under adversarial inducement at distributional shifts. In this study, we first theoretically analyze the intrinsic ethical vulnerability of aligned LLMs by proving that current alignment methods yield only local "safety regions" in the knowledge manifold. In contrast, pretrained knowledge remains globally connected to harmful concepts via high-likelihood adversarial trajectories. Building on this theoretical insight, we empirically validate our findings by employing semantic coherence inducement under distributional shifts--a method that systematically bypasses alignment constraints through optimized adversarial prompts. This combined theoretical and empirical approach achieves a 100% attack success rate across 19 out of 23 state-of-the-art aligned LLMs, including DeepSeek-R1 and LLaMA-3, revealing their universal vulnerabilities.

Via

Access Paper or Ask Questions

PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Aug 17, 2024

Jiawei Lian, Jianhong Pan, Lefan Wang, Yi Wang, Lap-Pui Chau, Shaohui Mei

Figure 1 for PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Figure 2 for PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Figure 3 for PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Figure 4 for PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Abstract:Physical attacks against object detection have gained increasing attention due to their significant practical implications. However, conducting physical experiments is extremely time-consuming and labor-intensive. Moreover, physical dynamics and cross-domain transformation are challenging to strictly regulate in the real world, leading to unaligned evaluation and comparison, severely hindering the development of physically robust models. To accommodate these challenges, we explore utilizing realistic simulation to thoroughly and rigorously benchmark physical attacks with fairness under controlled physical dynamics and cross-domain transformation. This resolves the problem of capturing identical adversarial images that cannot be achieved in the real world. Our benchmark includes 20 physical attack methods, 48 object detectors, comprehensive physical dynamics, and evaluation metrics. We also provide end-to-end pipelines for dataset generation, detection, evaluation, and further analysis. In addition, we perform 8064 groups of evaluation based on our benchmark, which includes both overall evaluation and further detailed ablation studies for controlled physical dynamics. Through these experiments, we provide in-depth analyses of physical attack performance and physical adversarial robustness, draw valuable observations, and discuss potential directions for future research. Codebase: https://github.com/JiaweiLian/Benchmarking_Physical_Attack

Via

Access Paper or Ask Questions

A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

Jun 21, 2023

Shaohui Mei, Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Lap-Pui Chau

Figure 1 for A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

Figure 2 for A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

Figure 3 for A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

Figure 4 for A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

Abstract:Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery. However, it has been demonstrated in previous works that DNNs are vulnerable to different types of noises, particularly adversarial noises. Surprisingly, there has been a lack of comprehensive studies on the robustness of RS tasks, prompting us to undertake a thorough survey and benchmark on the robustness of image classification and object detection in RS. To our best knowledge, this study represents the first comprehensive examination of both natural robustness and adversarial robustness in RS tasks. Specifically, we have curated and made publicly available datasets that contain natural and adversarial noises. These datasets serve as valuable resources for evaluating the robustness of DNNs-based models. To provide a comprehensive assessment of model robustness, we conducted meticulous experiments with numerous different classifiers and detectors, encompassing a wide range of mainstream methods. Through rigorous evaluation, we have uncovered insightful and intriguing findings, which shed light on the relationship between adversarial noise crafting and model training, yielding a deeper understanding of the susceptibility and limitations of various models, and providing guidance for the development of more resilient and robust models

Via

Access Paper or Ask Questions

CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

Mar 20, 2023

Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Shaohui Mei

Figure 1 for CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

Figure 2 for CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

Figure 3 for CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

Figure 4 for CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

Abstract:Patch-based physical attacks have increasingly aroused concerns. However, most existing methods focus on obscuring targets captured on the ground, and some of these methods are simply extended to deceive aerial detectors. They smear the targeted objects in the physical world with the elaborated adversarial patches, which can only slightly sway the aerial detectors' prediction and with weak attack transferability. To address the above issues, we propose to perform Contextual Background Attack (CBA), a novel physical attack framework against aerial detection, which can achieve strong attack efficacy and transferability in the physical world even without smudging the interested objects at all. Specifically, the targets of interest, i.e. the aircraft in aerial images, are adopted to mask adversarial patches. The pixels outside the mask area are optimized to make the generated adversarial patches closely cover the critical contextual background area for detection, which contributes to gifting adversarial patches with more robust and transferable attack potency in the real world. To further strengthen the attack performance, the adversarial patches are forced to be outside targets during training, by which the detected objects of interest, both on and outside patches, benefit the accumulation of attack efficacy. Consequently, the sophisticatedly designed patches are gifted with solid fooling efficacy against objects both on and outside the adversarial patches simultaneously. Extensive proportionally scaled experiments are performed in physical scenarios, demonstrating the superiority and potential of the proposed framework for physical attacks. We expect that the proposed physical attack method will serve as a benchmark for assessing the adversarial robustness of diverse aerial detectors and defense methods.

Via

Access Paper or Ask Questions

Contextual adversarial attack against aerial detection in the physical world

Feb 27, 2023

Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Shaohui Mei

Figure 1 for Contextual adversarial attack against aerial detection in the physical world

Figure 2 for Contextual adversarial attack against aerial detection in the physical world

Figure 3 for Contextual adversarial attack against aerial detection in the physical world

Figure 4 for Contextual adversarial attack against aerial detection in the physical world

Abstract:Deep Neural Networks (DNNs) have been extensively utilized in aerial detection. However, DNNs' sensitivity and vulnerability to maliciously elaborated adversarial examples have progressively garnered attention. Recently, physical attacks have gradually become a hot issue due to they are more practical in the real world, which poses great threats to some security-critical applications. In this paper, we take the first attempt to perform physical attacks in contextual form against aerial detection in the physical world. We propose an innovative contextual attack method against aerial detection in real scenarios, which achieves powerful attack performance and transfers well between various aerial object detectors without smearing or blocking the interested objects to hide. Based on the findings that the targets' contextual information plays an important role in aerial detection by observing the detectors' attention maps, we propose to make full use of the contextual area of the interested targets to elaborate contextual perturbations for the uncovered attacks in real scenarios. Extensive proportionally scaled experiments are conducted to evaluate the effectiveness of the proposed contextual attack method, which demonstrates the proposed method's superiority in both attack efficacy and physical practicality.

Via

Access Paper or Ask Questions

Benchmarking Adversarial Patch Against Aerial Detection

Oct 30, 2022

Jiawei Lian, Shaohui Mei, Shun Zhang, Mingyang Ma

Figure 1 for Benchmarking Adversarial Patch Against Aerial Detection

Figure 2 for Benchmarking Adversarial Patch Against Aerial Detection

Figure 3 for Benchmarking Adversarial Patch Against Aerial Detection

Figure 4 for Benchmarking Adversarial Patch Against Aerial Detection

Abstract:DNNs are vulnerable to adversarial examples, which poses great security concerns for security-critical systems. In this paper, a novel adaptive-patch-based physical attack (AP-PA) framework is proposed, which aims to generate adversarial patches that are adaptive in both physical dynamics and varying scales, and by which the particular targets can be hidden from being detected. Furthermore, the adversarial patch is also gifted with attack effectiveness against all targets of the same class with a patch outside the target (No need to smear targeted objects) and robust enough in the physical world. In addition, a new loss is devised to consider more available information of detected objects to optimize the adversarial patch, which can significantly improve the patch's attack efficacy (Average precision drop up to 87.86% and 85.48% in white-box and black-box settings, respectively) and optimizing efficiency. We also establish one of the first comprehensive, coherent, and rigorous benchmarks to evaluate the attack efficacy of adversarial patches on aerial detection tasks. Finally, several proportionally scaled experiments are performed physically to demonstrate that the elaborated adversarial patches can successfully deceive aerial detection algorithms in dynamic physical circumstances. The code is available at https://github.com/JiaweiLian/AP-PA.

* 14 pages, 14 figures

Via

Access Paper or Ask Questions

Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

Sep 22, 2022

Kun Hu, Shaohui Mei, Wei Wang, Kaylena A. Ehgoetz Martens, Liang Wang, Simon J. G. Lewis, David D. Feng, Zhiyong Wang

Figure 1 for Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

Figure 2 for Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

Figure 3 for Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

Figure 4 for Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

Abstract:Freezing of gait (FoG) is one of the most common symptoms of Parkinson's disease, which is a neurodegenerative disorder of the central nervous system impacting millions of people around the world. To address the pressing need to improve the quality of treatment for FoG, devising a computer-aided detection and quantification tool for FoG has been increasingly important. As a non-invasive technique for collecting motion patterns, the footstep pressure sequences obtained from pressure sensitive gait mats provide a great opportunity for evaluating FoG in the clinic and potentially in the home environment. In this study, FoG detection is formulated as a sequential modelling task and a novel deep learning architecture, namely Adversarial Spatio-temporal Network (ASTN), is proposed to learn FoG patterns across multiple levels. A novel adversarial training scheme is introduced with a multi-level subject discriminator to obtain subject-independent FoG representations, which helps to reduce the over-fitting risk due to the high inter-subject variance. As a result, robust FoG detection can be achieved for unseen subjects. The proposed scheme also sheds light on improving subject-level clinical studies from other scenarios as it can be integrated with many existing deep architectures. To the best of our knowledge, this is one of the first studies of footstep pressure-based FoG detection and the approach of utilizing ASTN is the first deep neural network architecture in pursuit of subject-independent representations. Experimental results on 393 trials collected from 21 subjects demonstrate encouraging performance of the proposed ASTN for FoG detection with an AUC 0.85.

Via

Access Paper or Ask Questions

Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Oct 21, 2021

Ge Zhang, Shaohui Mei, Mingyang Ma, Yan Feng, Qian Du

Figure 1 for Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Figure 2 for Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Figure 3 for Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Figure 4 for Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Abstract:Spectral unmixing (SU) expresses the mixed pixels existed in hyperspectral images as the product of endmember and abundance, which has been widely used in hyperspectral imagery analysis. However, the influence of light, acquisition conditions and the inherent properties of materials, results in that the identified endmembers can vary spectrally within a given image (construed as spectral variability). To address this issue, recent methods usually use a priori obtained spectral library to represent multiple characteristic spectra of the same object, but few of them extracted the spectral variability explicitly. In this paper, a spectral variability augmented sparse unmixing model (SVASU) is proposed, in which the spectral variability is extracted for the first time. The variable spectra are divided into two parts of intrinsic spectrum and spectral variability for spectral reconstruction, and modeled synchronously in the SU model adding the regular terms restricting the sparsity of abundance and the generalization of the variability coefficient. It is noted that the spectral variability library and the intrinsic spectral library are all constructed from the In-situ observed image. Experimental results over both synthetic and real-world data sets demonstrate that the augmented decomposition by spectral variability significantly improves the unmixing performance than the decomposition only by spectral library, as well as compared to state-of-the-art algorithms.

Via

Access Paper or Ask Questions

Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification

Aug 25, 2021

Shujun Yang, Junhui Hou, Yuheng Jia, Shaohui Mei, Qian Du

Figure 1 for Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification

Figure 2 for Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification

Figure 3 for Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification

Figure 4 for Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification

Abstract:In this paper, we propose a novel classification scheme for the remotely sensed hyperspectral image (HSI), namely SP-DLRR, by comprehensively exploring its unique characteristics, including the local spatial information and low-rankness. SP-DLRR is mainly composed of two modules, i.e., the classification-guided superpixel segmentation and the discriminative low-rank representation, which are iteratively conducted. Specifically, by utilizing the local spatial information and incorporating the predictions from a typical classifier, the first module segments pixels of an input HSI (or its restoration generated by the second module) into superpixels. According to the resulting superpixels, the pixels of the input HSI are then grouped into clusters and fed into our novel discriminative low-rank representation model with an effective numerical solution. Such a model is capable of increasing the intra-class similarity by suppressing the spectral variations locally while promoting the inter-class discriminability globally, leading to a restored HSI with more discriminative pixels. Experimental results on three benchmark datasets demonstrate the significant superiority of SP-DLRR over state-of-the-art methods, especially for the case with an extremely limited number of training pixels.

Via

Access Paper or Ask Questions