Anindya Sundar Das

Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination

Nov 14, 2024

Anindya Sundar Das, Guansong Pang, Monowar Bhuyan

Abstract: Visual anomaly detection aims to detect images that differ notably from the normal pattern, and it is widely used to identify defective parts in the manufacturing industry. Most anomaly detection methods train detection models on clean, unlabeled normal samples only, assuming no contamination, a condition that is often unmet in real-world scenarios. The performance of these methods depends heavily on the quality of the data and usually degrades when they are exposed to noise. We introduce a systematic adaptive method that employs deviation learning to compute anomaly scores end-to-end while addressing data contamination by assigning relative importance weights to individual instances. In this approach, the anomaly scores for normal instances are designed to approximate scalar scores drawn from a known prior distribution. Meanwhile, anomaly scores for anomalous examples are adjusted to exhibit statistically significant deviations from these reference scores. Our approach incorporates a constrained optimization problem within the deviation learning framework to update the instance weights, solving this problem for each mini-batch. Comprehensive experiments on the MVTec and VisA benchmark datasets indicate that our proposed method surpasses competing techniques and exhibits both stability and robustness in the presence of data contamination.

* Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025)
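
The deviation-learning objective described in the abstract above can be illustrated with a minimal sketch. It follows the commonly used deviation-loss form (normal scores pulled toward the mean of reference scores drawn from a standard Gaussian prior, anomaly scores pushed at least a margin above it) and is not taken from the paper itself. The function name `deviation_loss`, the margin of 5.0, the 5000 reference samples, and the way per-instance weights enter the loss are all assumptions made for illustration. The paper's constrained optimization for updating the weights is not reproduced here; the weights are simply passed in.

```python
import random
import statistics


def deviation_loss(scores, labels, weights=None, margin=5.0, n_ref=5000, seed=0):
    """Weighted deviation loss over one mini-batch (illustrative sketch).

    scores  -- anomaly scores produced by some model (hypothetical inputs)
    labels  -- 0 for samples treated as normal, 1 for labeled anomalies
    weights -- per-instance importance weights; assumed to be given
    """
    # Reference scores drawn from a standard Gaussian prior (assumption).
    rng = random.Random(seed)
    ref = [rng.gauss(0.0, 1.0) for _ in range(n_ref)]
    mu = statistics.fmean(ref)
    sigma = statistics.pstdev(ref)

    if weights is None:
        weights = [1.0] * len(scores)

    total = 0.0
    for s, y, w in zip(scores, labels, weights):
        dev = (s - mu) / sigma  # standardized deviation from the reference mean
        # Normal samples: keep the deviation close to zero.
        # Anomalies: require a deviation of at least `margin`.
        loss = (1 - y) * abs(dev) + y * max(0.0, margin - dev)
        total += w * loss
    return total / sum(weights)


# A contaminated sample (second item: anomalous score, labeled normal)
# inflates the loss unless its weight is reduced.
scores = [0.0, 6.0, 6.0]
labels = [0, 0, 1]
unweighted = deviation_loss(scores, labels)
downweighted = deviation_loss(scores, labels, weights=[1.0, 0.0, 1.0])
```

In this toy batch, setting the weight of the suspicious sample to zero removes its contribution, which is the intuition behind weighting instances under contamination.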

Few-shot Anomaly Detection in Text with Deviation Learning

Aug 22, 2023

Anindya Sundar Das, Aravind Ajay, Sriparna Saha, Monowar Bhuyan

Abstract: Most current methods for detecting anomalies in text construct models that rely solely on unlabeled data. These models assume that no labeled anomalous examples are available, which prevents them from using prior knowledge of anomalies that is typically available in small quantities in many real-world applications. Furthermore, these models prioritize learning feature embeddings rather than optimizing anomaly scores directly, which can lead to suboptimal anomaly scoring and inefficient use of data during learning. In this paper, we introduce FATE, a deep few-shot learning-based framework that leverages a limited number of anomaly examples and learns anomaly scores explicitly in an end-to-end manner using deviation learning. In this approach, the anomaly scores of normal examples are adjusted to closely resemble reference scores obtained from a prior distribution. Conversely, anomalous samples are forced to have anomaly scores that deviate considerably from the reference score, in the upper tail of the prior. Additionally, our model is optimized to learn the distinct behavior of anomalies by using a multi-head self-attention layer and multiple instance learning approaches. Comprehensive experiments on several benchmark datasets demonstrate that our proposed approach achieves new state-of-the-art performance.

* Accepted in ICONIP 2023
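
The abstract above mentions multiple instance learning without giving details. One common way to turn several instance-level scores into a single score for a text is top-k pooling, sketched below. Whether FATE aggregates scores this way is an assumption; the function name `topk_mil_score` and the parameter `k` are hypothetical and chosen only for illustration.

```python
def topk_mil_score(instance_scores, k=3):
    """Aggregate instance-level anomaly scores into one bag-level score.

    Treats a text as a bag of instances (for example, one score per
    attention head or per token group) and averages the k highest scores.
    This is a generic top-k pooling sketch, not the paper's exact method.
    """
    if not instance_scores:
        raise ValueError("instance_scores must not be empty")
    k = max(1, min(k, len(instance_scores)))  # clamp k to a valid range
    top = sorted(instance_scores, reverse=True)[:k]
    return sum(top) / len(top)


# Two strongly anomalous instances dominate the bag-level score.
bag_score = topk_mil_score([0.1, 0.2, 4.0, 5.0, 0.0], k=2)
```

Averaging only the highest instance scores lets a few strongly anomalous parts of a text drive the overall score, rather than being diluted by the many unremarkable parts.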

Self-Supervised Image-to-Text and Text-to-Image Synthesis

Dec 09, 2021

Anindya Sundar Das, Sriparna Saha

Abstract: A comprehensive understanding of vision and language and their interrelation is crucial to recognize the underlying similarities and differences between these modalities and to learn more generalized, meaningful representations. In recent years, most work on Text-to-Image synthesis and Image-to-Text generation has focused on supervised generative deep architectures, with very little interest in learning the similarities between the embedding spaces across modalities. In this paper, we propose a novel self-supervised deep learning-based approach to learning the cross-modal embedding spaces for both image-to-text and text-to-image generation. In our approach, we first obtain dense vector representations of images using a StackGAN-based autoencoder model, and dense sentence-level vector representations using an LSTM-based text autoencoder; we then study the mapping from the embedding space of one modality to the embedding space of the other using GAN-based and maximum mean discrepancy-based generative networks. We also demonstrate, both qualitatively and quantitatively, that our model learns to generate textual descriptions from image data as well as images from textual data.

* ICONIP 2021. Lecture Notes in Computer Science, vol 13111, pp 415-426. Springer, Cham
* ICONIP 2021: The 28th International Conference on Neural Information Processing
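
The maximum mean discrepancy (MMD) named in the abstract above measures how far apart two sets of embeddings are in distribution. A minimal sketch of the standard biased estimator with a Gaussian (RBF) kernel follows. The kernel choice, the `bandwidth` value, and the function names are assumptions for illustration; the paper's networks and training loop are not reproduced.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of vectors.

    xs, ys -- lists of embeddings, e.g. mapped image embeddings and
              text embeddings (hypothetical inputs)
    """
    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from a and b.
        total = sum(rbf_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy embeddings: the second set is shifted far away from the first.
image_like = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
text_like = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
same = mmd_squared(image_like, image_like)
shifted = mmd_squared(image_like, text_like)
```

A mapping network trained to minimize such a quantity is pushed to make the mapped embeddings of one modality statistically indistinguishable from the embeddings of the other.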
