Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuqi Cheng

Visual Anomaly Detection under Complex View-Illumination Interplay: A Large-Scale Benchmark

May 16, 2025

Yunkang Cao, Yuqi Cheng, Xiaohao Xu, Yiheng Zhang, Yihan Sun, Yuxiang Tan, Yuxin Zhang, Xiaonan Huang, Weiming Shen

Abstract:The practical deployment of Visual Anomaly Detection (VAD) systems is hindered by their sensitivity to real-world imaging variations, particularly the complex interplay between viewpoint and illumination which drastically alters defect visibility. Current benchmarks largely overlook this critical challenge. We introduce Multi-View Multi-Illumination Anomaly Detection (M2AD), a new large-scale benchmark comprising 119,880 high-resolution images designed explicitly to probe VAD robustness under such interacting conditions. By systematically capturing 999 specimens across 10 categories using 12 synchronized views and 10 illumination settings (120 configurations total), M2AD enables rigorous evaluation. We establish two evaluation protocols: M2AD-Synergy tests the ability to fuse information across diverse configurations, and M2AD-Invariant measures single-image robustness against realistic view-illumination effects. Our extensive benchmarking shows that state-of-the-art VAD methods struggle significantly on M2AD, demonstrating the profound challenge posed by view-illumination interplay. This benchmark serves as an essential tool for developing and validating VAD methods capable of overcoming real-world complexities. Our full dataset and test suite will be released at https://hustcyq.github.io/M2AD to facilitate the field.

* Homgepage: https://hustcyq.github.io/M2AD/. Yunkang Cao and Yuqi Cheng contribute equally to this work

Via

Access Paper or Ask Questions

Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection

May 12, 2025

Yuqi Cheng, Yunkang Cao, Dongfang Wang, Weiming Shen, Wenlong Li

Abstract:Point cloud anomaly detection is essential for various industrial applications. The huge computation and storage costs caused by the increasing product classes limit the application of single-class unsupervised methods, necessitating the development of multi-class unsupervised methods. However, the feature similarity between normal and anomalous points from different class data leads to the feature confusion problem, which greatly hinders the performance of multi-class methods. Therefore, we introduce a multi-class point cloud anomaly detection method, named GLFM, leveraging global-local feature matching to progressively separate data that are prone to confusion across multiple classes. Specifically, GLFM is structured into three stages: Stage-I proposes an anomaly synthesis pipeline that stretches point clouds to create abundant anomaly data that are utilized to adapt the point cloud feature extractor for better feature representation. Stage-II establishes the global and local memory banks according to the global and local feature distributions of all the training data, weakening the impact of feature confusion on the establishment of the memory bank. Stage-III implements anomaly detection of test data leveraging its feature distance from global and local memory banks. Extensive experiments on the MVTec 3D-AD, Real3D-AD and actual industry parts dataset showcase our proposed GLFM's superior point cloud anomaly detection performance. The code is available at https://github.com/hustCYQ/GLFM-Multi-class-3DAD.

* 12 pages, 12 figures

Via

Access Paper or Ask Questions

Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection

Mar 04, 2025

Wei Luo, Yunkang Cao, Haiming Yao, Xiaotian Zhang, Jianan Lou, Yuqi Cheng, Weiming Shen, Wenyong Yu

Abstract:Anomaly detection (AD) is essential for industrial inspection, yet existing methods typically rely on ``comparing'' test images to normal references from a training set. However, variations in appearance and positioning often complicate the alignment of these references with the test image, limiting detection accuracy. We observe that most anomalies manifest as local variations, meaning that even within anomalous images, valuable normal information remains. We argue that this information is useful and may be more aligned with the anomalies since both the anomalies and the normal information originate from the same image. Therefore, rather than relying on external normality from the training set, we propose INP-Former, a novel method that extracts Intrinsic Normal Prototypes (INPs) directly from the test image. Specifically, we introduce the INP Extractor, which linearly combines normal tokens to represent INPs. We further propose an INP Coherence Loss to ensure INPs can faithfully represent normality for the testing image. These INPs then guide the INP-Guided Decoder to reconstruct only normal tokens, with reconstruction errors serving as anomaly scores. Additionally, we propose a Soft Mining Loss to prioritize hard-to-optimize samples during training. INP-Former achieves state-of-the-art performance in single-class, multi-class, and few-shot AD tasks across MVTec-AD, VisA, and Real-IAD, positioning it as a versatile and universal solution for AD. Remarkably, INP-Former also demonstrates some zero-shot AD capability. Code is available at:https://github.com/luow23/INP-Former.

* Accepted by CVPR2025

Via

Access Paper or Ask Questions

A Robust Probability-based Joint Registration Method of Multiple Point Clouds Considering Local Consistency

Sep 15, 2024

Lingjie Su, Wei Xu, Shuyang Zhao, Yuqi Cheng, Wenlong Li

Figure 1 for A Robust Probability-based Joint Registration Method of Multiple Point Clouds Considering Local Consistency

Figure 2 for A Robust Probability-based Joint Registration Method of Multiple Point Clouds Considering Local Consistency

Figure 3 for A Robust Probability-based Joint Registration Method of Multiple Point Clouds Considering Local Consistency

Figure 4 for A Robust Probability-based Joint Registration Method of Multiple Point Clouds Considering Local Consistency

Abstract:In robotic inspection, joint registration of multiple point clouds is an essential technique for estimating the transformation relationships between measured parts, such as multiple blades in a propeller. However, the presence of noise and outliers in the data can significantly impair the registration performance by affecting the correctness of correspondences. To address this issue, we incorporate local consistency property into the probability-based joint registration method. Specifically, each measured point set is treated as a sample from an unknown Gaussian Mixture Model (GMM), and the registration problem is framed as estimating the probability model. By incorporating local consistency into the optimization process, we enhance the robustness and accuracy of the posterior distributions, which represent the one-to-all correspondences that directly determine the registration results. Effective closed-form solution for transformation and probability parameters are derived with Expectation-Maximization (EM) algorithm. Extensive experiments demonstrate that our method outperforms the existing methods, achieving high accuracy and robustness with the existence of noise and outliers. The code will be available at https://github.com/sulingjie/JPRLC_registration.

* Submitted to ICRA 2025

Via

Access Paper or Ask Questions

AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

Jul 22, 2024

Yunkang Cao, Jiangning Zhang, Luca Frittoli, Yuqi Cheng, Weiming Shen, Giacomo Boracchi

Figure 1 for AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

Figure 2 for AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

Figure 3 for AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

Figure 4 for AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

Abstract:Zero-shot anomaly detection (ZSAD) targets the identification of anomalies within images from arbitrary novel categories. This study introduces AdaCLIP for the ZSAD task, leveraging a pre-trained vision-language model (VLM), CLIP. AdaCLIP incorporates learnable prompts into CLIP and optimizes them through training on auxiliary annotated anomaly detection data. Two types of learnable prompts are proposed: static and dynamic. Static prompts are shared across all images, serving to preliminarily adapt CLIP for ZSAD. In contrast, dynamic prompts are generated for each test image, providing CLIP with dynamic adaptation capabilities. The combination of static and dynamic prompts is referred to as hybrid prompts, and yields enhanced ZSAD performance. Extensive experiments conducted across 14 real-world anomaly detection datasets from industrial and medical domains indicate that AdaCLIP outperforms other ZSAD methods and can generalize better to different categories and even domains. Finally, our analysis highlights the importance of diverse auxiliary data and optimized prompts for enhanced generalization capacity. Code is available at https://github.com/caoyunkang/AdaCLIP.

* Accepted by ECCV 2024

Via

Access Paper or Ask Questions

RAD: A Comprehensive Dataset for Benchmarking the Robustness of Image Anomaly Detection

Jun 11, 2024

Yuqi Cheng, Yunkang Cao, Rui Chen, Weiming Shen

Abstract:Robustness against noisy imaging is crucial for practical image anomaly detection systems. This study introduces a Robust Anomaly Detection (RAD) dataset with free views, uneven illuminations, and blurry collections to systematically evaluate the robustness of current anomaly detection methods. Specifically, RAD aims to identify foreign objects on working platforms as anomalies. The collection process incorporates various sources of imaging noise, such as viewpoint changes, uneven illuminations, and blurry collections, to replicate real-world inspection scenarios. Subsequently, we assess and analyze 11 state-of-the-art unsupervised and zero-shot methods on RAD. Our findings indicate that: 1) Variations in viewpoint, illumination, and blurring affect anomaly detection methods to varying degrees; 2) Methods relying on memory banks and assisted by synthetic anomalies demonstrate stronger robustness; 3) Effectively leveraging the general knowledge of foundational models is a promising avenue for enhancing the robustness of anomaly detection methods.

* 6 pages, 5 figures

Via

Access Paper or Ask Questions

A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect

Jan 29, 2024

Yunkang Cao, Xiaohao Xu, Jiangning Zhang, Yuqi Cheng, Xiaonan Huang, Guansong Pang, Weiming Shen

Abstract:Visual Anomaly Detection (VAD) endeavors to pinpoint deviations from the concept of normality in visual data, widely applied across diverse domains, e.g., industrial defect inspection, and medical lesion detection. This survey comprehensively examines recent advancements in VAD by identifying three primary challenges: 1) scarcity of training data, 2) diversity of visual modalities, and 3) complexity of hierarchical anomalies. Starting with a brief overview of the VAD background and its generic concept definitions, we progressively categorize, emphasize, and discuss the latest VAD progress from the perspective of sample number, data modality, and anomaly hierarchy. Through an in-depth analysis of the VAD field, we finally summarize future developments for VAD and conclude the key findings and contributions of this survey.

* Work in progress. Yunkang Cao, Xiaohao Xu, and Jiangning Zhang contribute equally to this work

Via

Access Paper or Ask Questions

Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

Jun 15, 2023

Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Liang Gao, Weiming Shen

Abstract:This technical report introduces the winning solution of the team \textit{Segment Any Anomaly} for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompt, \textit{e.g.}, language prompt, we present a novel framework, \textit{i.e.}, Segment Any Anomaly + (SAA$+$), for zero-shot anomaly segmentation with multi-modal prompts for the regularization of cascaded modern foundation models. Inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly (SAA) to leverage diverse multi-modal prior knowledge for anomaly localization. Subsequently, we further introduce multimodal prompts (SAA$+$) derived from domain expert knowledge and target image context to enable the non-parameter adaptation of foundation models to anomaly segmentation. The proposed SAA$+$ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will release the code of our winning solution for the CVPR2023 VAND challenge at \href{Segment-Any-Anomaly}{https://github.com/caoyunkang/Segment-Any-Anomaly} \footnote{The extended-version paper with more details is available at ~\cite{cao2023segment}.}

* The first two author contribute equally. CVPR workshop challenge report. arXiv admin note: substantial text overlap with arXiv:2305.10724

Via

Access Paper or Ask Questions

Segment Any Anomaly without Training via Hybrid Prompt Regularization

May 18, 2023

Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Zongwei Du, Liang Gao, Weiming Shen

Figure 1 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 2 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 3 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 4 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Abstract:We present a novel framework, i.e., Segment Any Anomaly + (SAA+), for zero-shot anomaly segmentation with hybrid prompt regularization to improve the adaptability of modern foundation models. Existing anomaly segmentation models typically rely on domain-specific fine-tuning, limiting their generalization across countless anomaly patterns. In this work, inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly to leverage diverse multi-modal prior knowledge for anomaly localization. For non-parameter foundation model adaptation to anomaly segmentation, we further introduce hybrid prompts derived from domain expert knowledge and target image context as regularization. Our proposed SAA+ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA, MVTec-AD, MTD, and KSDD2, in the zero-shot setting. We will release the code at \href{https://github.com/caoyunkang/Segment-Any-Anomaly}{https://github.com/caoyunkang/Segment-Any-Anomaly}.

* The first two authors contribute equally. The code will be available on https://github.com/caoyunkang/Segment-Any-Anomaly

Via

Access Paper or Ask Questions