Abstract:Anomaly detection (AD) is a fundamental task in computer vision. It aims to identify incorrect image data patterns which deviate from the normal ones. Conventional methods generally address AD by preparing augmented negative samples to enforce self-supervised learning. However, these techniques typically do not consider semantics during augmentation, leading to the generation of unrealistic or invalid negative samples. Consequently, the feature extraction network can be hindered from embedding critical features. In this study, inspired by visual attention learning approaches, we propose CutSwap, which leverages saliency guidance to incorporate semantic cues for augmentation. Specifically, we first employ LayerCAM to extract multilevel image features as saliency maps and then perform clustering to obtain multiple centroids. To fully exploit saliency guidance, on each map, we select a pixel pair from the cluster with the highest centroid saliency to form a patch pair. Such a patch pair includes highly similar context information with dense semantic correlations. The resulting negative sample is created by swapping the locations of the patch pair. Compared to prior augmentation methods, CutSwap generates more subtle yet realistic negative samples to facilitate quality feature learning. Extensive experimental and ablative evaluations demonstrate that our method achieves state-of-the-art AD performance on two mainstream AD benchmark datasets.
Abstract:We present PD-REAL, a novel large-scale dataset for unsupervised anomaly detection (AD) in the 3D domain. It is motivated by the fact that 2D-only representations in the AD task may fail to capture the geometric structures of anomalies due to uncertainty in lighting conditions or shooting angles. PD-REAL consists entirely of Play-Doh models for 15 object categories and focuses on the analysis of potential benefits from 3D information in a controlled environment. Specifically, objects are first created with six types of anomalies, such as dent, crack, or perforation, and then photographed under different lighting conditions to mimic real-world inspection scenarios. To demonstrate the usefulness of 3D information, we use a commercially available RealSense camera to capture RGB and depth images. Compared to the existing 3D dataset for AD tasks, the data acquisition of PD-REAL is significantly cheaper, easily scalable and easier to control variables. Extensive evaluations with state-of-the-art AD algorithms on our dataset demonstrate the benefits as well as challenges of using 3D information. Our dataset can be downloaded from https://github.com/Andy-cs008/PD-REAL
Abstract:Anomaly detection, which is a critical and popular topic in computer vision, aims to detect anomalous samples that are different from the normal (i.e., non-anomalous) ones. The current mainstream methods focus on anomaly detection for images, whereas little attention has been paid to 3D point cloud. In this paper, drawing inspiration from the knowledge transfer ability of teacher-student architecture and the impressive feature extraction capability of recent neural networks, we design a teacher-student structured model for 3D anomaly detection. Specifically, we use feature space alignment, dimension zoom, and max pooling to extract the features of the point cloud and then minimize a multi-scale loss between the feature vectors produced by the teacher and the student networks. Moreover, our method only requires very few normal samples to train the student network due to the teacher-student distillation mechanism. Once trained, the teacher-student network pair can be leveraged jointly to fulfill 3D point cloud anomaly detection based on the calculated anomaly score. For evaluation, we compare our method against the reconstruction-based method on the ShapeNet-Part dataset. The experimental results and ablation studies quantitatively and qualitatively confirm that our model can achieve higher performance compared with the state of the arts in 3D anomaly detection with very few training samples.