Abstract:Detecting surface anomalies in industrial materials is a significant challenge across many manufacturing processes. Recently, various methods have emerged that use a network pre-trained on natural images to extract representative features. These features are then processed with techniques such as memory banks, normalizing flows, and knowledge distillation, which have achieved exceptional accuracy. This paper revisits approaches based on pre-trained features by introducing a novel method centered on target-specific embedding. To capture the most representative features of the texture under consideration, we employ a variant of a contrastive training procedure that uses both artificially generated defective samples and anomaly-free samples during training. Exploiting the intrinsic properties of surfaces, we derive a meaningful representation from the defect-free samples during training, which enables a straightforward yet effective calculation of anomaly scores. Experiments on the MVTec AD and TILDA datasets demonstrate the competitiveness of our approach compared to state-of-the-art methods.
Abstract:Unsupervised texture anomaly detection is a long-standing concern in many industrial processes. Inspection of patterned textures, particularly fabric defect detection, is a widely encountered use case. The task involves a diverse spectrum of colors and textile types. Given the extensive variability in colors, textures, and defect types, fabric defect detection is a complex and challenging problem in patterned texture inspection. In this article, we propose a knowledge distillation-based approach tailored to unsupervised anomaly detection in fabric-like textures. Our method adapts the recently introduced reverse distillation approach, which advocates an encoder-decoder design to mitigate classifier bias and to prevent the student from reconstructing anomalies, to the specific task of fabric defect detection. Our approach relies on careful design choices that emphasize high-level features. To demonstrate its capabilities in terms of both performance and inference speed, we conducted experiments on multiple texture datasets, including MVTec AD, AITEX, and TILDA, as well as on a dataset acquired from a textile manufacturing facility. The main contributions of this paper are a robust texture anomaly detector based on reverse knowledge distillation, suitable for both anomaly detection and domain generalization, and a novel dataset encompassing a diverse range of fabrics and defects.
Abstract:Unsupervised learning for anomaly detection has long been at the heart of image processing research and a stepping stone toward high-performance industrial automation. With the emergence of CNNs, several methods have been proposed, such as autoencoders, GANs, and deep feature extraction. In this paper, we propose a new method based on the promising concept of knowledge distillation, which consists of training a network (the student) on normal samples while considering the output of a larger pretrained network (the teacher). The main contributions of this paper are twofold. First, we propose a reduced student architecture with optimal layer selection. Second, we propose a new student-teacher architecture that combines two teachers to reduce network bias, in order to jointly enhance anomaly detection performance and localization accuracy. The proposed texture anomaly detector has an outstanding capability to detect defects in any texture and a fast inference time compared to state-of-the-art (SOTA) methods.
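The student-teacher scheme in the abstract above scores a test image by how far the student's features drift from the teacher's. The following is a minimal sketch of that idea, assuming a per-location cosine distance between the two feature maps and a maximum over locations as the image-level score. The abstract does not state which distance or aggregation the authors use, so both are assumptions. The names `cosine_distance`, `anomaly_map`, and `image_score` are hypothetical, and the feature maps are plain nested lists standing in for network outputs.

```python
import math


def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / max(norm_u * norm_v, eps)


def anomaly_map(teacher_feats, student_feats):
    """Per-location discrepancy between teacher and student feature maps.

    Both inputs are H x W grids (lists of rows) of feature vectors.
    Assumption: cosine distance is the discrepancy measure.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


def image_score(score_map):
    """Image-level anomaly score (assumption: maximum over all locations)."""
    return max(max(row) for row in score_map)


# Toy 1 x 2 feature maps: the student matches the teacher at the first
# location and disagrees completely at the second.
teacher = [[[3.0, 4.0], [1.0, 0.0]]]
student = [[[3.0, 4.0], [0.0, 1.0]]]
scores = anomaly_map(teacher, student)
print(scores, image_score(scores))
```

Because the student is trained only on normal samples, locations where it fails to reproduce the teacher's features receive a high score, which is what makes the map usable for localization as well as detection.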
Abstract:Unsupervised anomaly detection in industry has been a concerning topic and a stepping stone toward high-performance industrial automation. The vast majority of industry-oriented methods focus on learning from good samples to detect anomalies, although some industrial scenarios require even less specific training and therefore anomaly detection that generalizes across domains. The obvious use case is fabric anomaly detection, where we have to deal with a very wide range of colors and textile types, and where stopping the production line for training is not an option. In this paper, we propose an automated process for industrial fabric texture defect detection with a specificity-learning step during domain-generalized anomaly detection. Combining the ability to generalize with this learning process offers fast and precise anomaly detection and segmentation. The main contributions of this paper are the following: a domain-generalized texture anomaly detection method achieving state-of-the-art performance, fast specific training on good samples extracted by the proposed method, a self-evaluation method based on custom defect creation, and automatic detection of previously seen fabrics to prevent re-training.
Abstract:Energy-based latent variable models (EBLVMs) are more expressive than conventional energy-based models. However, their potential on visual tasks is limited by a training process based on maximum likelihood estimation, which requires sampling from two intractable distributions. In this paper, we propose Bi-level doubly variational learning (BiDVL), which is based on a new bi-level optimization framework and two tractable variational distributions to facilitate learning EBLVMs. In particular, we introduce a decoupled EBLVM consisting of a marginal energy-based distribution and a structural posterior to handle the difficulties of learning deep EBLVMs on images. By choosing a symmetric KL divergence in the lower level of our framework, a compact BiDVL for visual tasks can be obtained. Our model achieves impressive image generation performance compared with related works. It also demonstrates significant capacity for test image reconstruction and out-of-distribution detection.
Abstract:In this paper, we propose a novel method for irregularity detection. Previous work treats this problem as a One-Class Classification (OCC) task, in which a reference model is trained on all of the available samples and a test sample is considered an anomaly if it deviates from the reference model. Generative Adversarial Networks (GANs) have achieved the most promising results for OCC, but implementing and training such networks, especially for the OCC task, is cumbersome and computationally expensive. To cope with these challenges, we present a simple but effective method that treats irregularity detection as a binary classification task, making the implementation easier while improving detection performance. We train two deep neural networks (a generator and a discriminator) in a GAN-style setting on only the normal samples. During training, the generator gradually becomes an expert at generating samples similar to the normal ones. When the generator fails to produce normal data (in the early stages of learning and before complete convergence), it can be regarded as an irregularity generator. In this way, we generate irregular samples as a by-product of training. Afterward, we train a binary classifier on the generated anomalous samples along with the normal instances so that it can detect irregularities. The proposed framework applies to related tasks, namely outlier detection in images and anomaly detection in videos. The results confirm that our proposed method is superior to the baseline and state-of-the-art solutions.
Abstract:Temporal action proposal generation, which originates from temporal action recognition, is an important and challenging problem in computer vision. Because video files are large, the speed of temporal action recognition is a difficulty for both researchers and companies. Training a convolutional neural network (CNN) for temporal action recognition requires feeding a large number of videos into the CNN. The training process should therefore be accelerated to achieve a faster response of the temporal action recognition system. To address this, we implement a ring parallel architecture using the Message Passing Interface (MPI). Unlike the traditional parameter server architecture, our new architecture reduces total data transmission by adding connections between the computing nodes. Compared to the parameter server architecture, our parallel architecture is more efficient on the temporal action proposal generation task with multiple GPUs, which is significant for dealing with large-scale video databases. In addition, because time consumption is rarely evaluated in distributed deep learning systems, we propose the concept of training time metrics, which can assess performance in the distributed training process.
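The abstract above contrasts a ring architecture with a parameter server but does not spell out the communication pattern. As a hedged illustration, the sketch below simulates the commonly used ring all-reduce in a single process, assuming that this is the kind of ring exchange intended (the abstract does not confirm it). The function name `ring_allreduce` is hypothetical, plain lists stand in for gradient buffers, and no actual MPI calls are made.

```python
def ring_allreduce(worker_grads):
    """Simulate ring all-reduce: every worker ends with the element-wise sum.

    worker_grads: one equal-length gradient list per worker. The length
    must be divisible by the number of workers so it splits into chunks.
    """
    n = len(worker_grads)
    length = len(worker_grads[0])
    if length % n != 0:
        raise ValueError("gradient length must be divisible by worker count")
    chunk = length // n
    # Each worker's buffer is split into n chunks.
    bufs = [[list(g[i * chunk:(i + 1) * chunk]) for i in range(n)]
            for g in worker_grads]

    # Phase 1 (scatter-reduce): in step s, worker r sends chunk (r - s) % n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        sends = [(r, (r - s) % n, list(bufs[r][(r - s) % n]))
                 for r in range(n)]
        for r, c, data in sends:
            dst = (r + 1) % n
            bufs[dst][c] = [a + b for a, b in zip(bufs[dst][c], data)]

    # Now worker r holds the fully reduced chunk (r + 1) % n.
    # Phase 2 (all-gather): the finished chunks travel once around the ring.
    for s in range(n - 1):
        sends = [(r, (r + 1 - s) % n, list(bufs[r][(r + 1 - s) % n]))
                 for r in range(n)]
        for r, c, data in sends:
            bufs[(r + 1) % n][c] = data

    # Flatten each worker's chunks back into a single gradient list.
    return [[x for c in b for x in c] for b in bufs]


grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
print(ring_allreduce(grads))
```

In this pattern each worker sends 2(n - 1) chunks of size 1/n of the gradient, so per-worker traffic stays below twice the gradient size regardless of the number of workers, whereas a single parameter server must receive and send a full gradient for every worker.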
Abstract:Detecting anomalies in crowded scenes has long been a challenging problem. In this paper, a self-supervised framework, the abnormal event detection network (AED-Net), which is composed of PCAnet and kernel principal component analysis (kPCA), is proposed to address this problem. Using surveillance video sequences of different scenes as raw data, PCAnet is trained to extract high-level semantics of the crowd's situation. Next, kPCA, a one-class classifier, is trained to determine whether the scene is anomalous. In contrast to some prevailing deep learning methods, the framework is completely self-supervised because it uses only video sequences of normal situations. Experiments on global and local abnormal event detection are carried out on the UMN and UCSD datasets, and competitive results with higher EER and AUC compared to other state-of-the-art methods are observed. Furthermore, by adding a local response normalization (LRN) layer, we propose an improvement to the original AED-Net, which the experiments show performs better by improving the framework's generalization capacity.
Abstract:As an important research topic in computer vision, fine-grained classification, which aims to recognize subordinate-level categories, has attracted significant attention. We propose a novel region-based ensemble learning network for fine-grained classification. Our approach contains a detection module and a classification module. The detection module is based on the Faster R-CNN framework and locates the semantic regions of the object. The classification module uses an ensemble learning method, which trains a set of sub-classifiers for different semantic regions and combines them into a stronger classifier. In the evaluation, we conduct experiments on the CUB-2011 dataset, and the results show that our method is efficient for fine-grained classification. We also extend our approach to remote sensing scene recognition and evaluate it on the NWPU-RESISC45 dataset.
Abstract:In real-life environments, the sudden appearance of windows, lights, and objects blocking the light source means that a visual SLAM system can easily capture low-contrast images caused by overexposure or underexposure. In such cases, the direct method of estimating camera motion from pixel luminance information is infeasible, and it is often difficult to find enough valid feature points without image processing. This paper proposes HE-SLAM, a new method combining histogram equalization and ORB feature extraction, which can be robust in more scenes, especially in stretches of low-contrast images. Because HE-SLAM uses histogram equalization to improve image contrast, it can extract enough valid feature points from low-contrast images for subsequent feature matching, keyframe selection, bundle adjustment, and loop closure detection. HE-SLAM has been tested on popular datasets (such as KITTI and EuRoC), and the real-time performance and robustness of the system are demonstrated by comparing system runtime and the root mean square error (RMSE) of the absolute trajectory error (ATE) with state-of-the-art methods such as ORB-SLAM2.
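The contrast-enhancement step that the abstract above relies on can be illustrated with textbook global histogram equalization. This is a minimal sketch under the assumption that a plain global equalization is meant, since the abstract does not say which variant HE-SLAM applies. The function name `equalize_histogram` is hypothetical, and the image is a nested list of 8-bit grayscale values rather than a real camera frame.

```python
def equalize_histogram(image, levels=256):
    """Return a histogram-equalized copy of a grayscale image.

    image: list of rows, each a list of ints in [0, levels - 1].
    """
    pixels = [p for row in image for p in row]

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [list(row) for row in image]

    # Map each intensity so the used range spans [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Four nearly identical grey levels are spread over the full 8-bit range.
low_contrast = [[100, 101], [102, 103]]
print(equalize_histogram(low_contrast))  # [[0, 85], [170, 255]]
```

Spreading the intensities this way increases local gradients, which is why a corner-based detector such as ORB can find more keypoints after equalization than on the raw low-contrast frame.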