Shunsuke Sakai

Reconstruction-Free Anomaly Detection with Diffusion Models via Direct Latent Likelihood Evaluation

Apr 08, 2025

Shunsuke Sakai, Tatsuhito Hasegawa

Abstract: Diffusion models, with their robust distribution approximation capabilities, have demonstrated excellent performance in anomaly detection. However, conventional reconstruction-based approaches rely on computing the reconstruction error between the original and denoised images, which requires careful noise-strength tuning and over ten network evaluations per input, leading to significantly slower detection. To address these limitations, we propose a novel diffusion-based anomaly detection method that circumvents the need for resource-intensive reconstruction. Instead of reconstructing the input image, we directly infer its corresponding latent variables and measure their density under the Gaussian prior distribution. Remarkably, the prior density proves effective as an anomaly score even when using a short partial diffusion process of only 2-5 steps. We evaluate our method on the MVTecAD dataset, achieving an AUC of 0.991 at 15 FPS, thereby setting a new state-of-the-art trade-off between speed and AUC in anomaly detection.

* Code is available at https://github.com/SkyShunsuke/InversionAD
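
The scoring idea in the abstract above, evaluating an inferred latent under a Gaussian prior, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a standard normal prior N(0, I) and a flattened latent vector, and `invert_fn` and `toy_invert` are hypothetical stand-ins for the short partial diffusion process that maps an image to its latent.

```python
import math
import random


def gaussian_prior_nll(latent):
    """Negative log-density of a flat latent vector under N(0, I)."""
    dim = len(latent)
    sq_norm = sum(z * z for z in latent)
    return 0.5 * sq_norm + 0.5 * dim * math.log(2.0 * math.pi)


def anomaly_score(image, invert_fn, num_steps=3):
    """Score an input by the prior NLL of its inferred latent.

    invert_fn is a hypothetical callable standing in for the short
    partial diffusion process; it is not the authors' API.
    """
    latent = invert_fn(image, num_steps)
    return gaussian_prior_nll(latent)


def toy_invert(x, num_steps):
    """Toy stand-in for inversion (identity map), for illustration only."""
    return list(x)


rng = random.Random(0)
normal_like = [rng.gauss(0.0, 1.0) for _ in range(64)]
anomalous_like = [z + 3.0 for z in normal_like]  # shifted away from the prior

score_normal = anomaly_score(normal_like, toy_invert)
score_anomalous = anomaly_score(anomalous_like, toy_invert)
print(score_normal, score_anomalous)
```

A higher score means lower density under the prior. In this toy setup the shifted vector scores higher, which mirrors the intended behavior for anomalous inputs.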

Noisy Deep Ensemble: Accelerating Deep Ensemble Learning via Noise Injection

Apr 08, 2025

Shunsuke Sakai, Shunsuke Tsuge, Tatsuhito Hasegawa

Abstract: Neural network ensembling is a simple yet effective approach for enhancing generalization capabilities. The most common method involves independently training multiple neural networks initialized with different weights and then averaging their predictions during inference. However, this approach increases training time linearly with the number of ensemble members. To address this issue, we propose the novel "Noisy Deep Ensemble" method, which significantly reduces the training time required for neural network ensembles. In this method, a parent model is trained until convergence, and then the weights of the parent model are perturbed in various ways to construct multiple child models. This perturbation of the parent model weights facilitates the exploration of different local minima while significantly reducing the training time for each ensemble member. We evaluated our method using diverse CNN architectures on the CIFAR-10 and CIFAR-100 datasets, surpassing conventional efficient ensemble methods and achieving test accuracy comparable to standard ensembles. Code is available at https://github.com/TSTB-dev/NoisyDeepEnsemble
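
The parent-to-child construction described in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code. It uses a tiny linear classifier in place of a CNN and additive Gaussian noise as the perturbation (the abstract only says the weights are perturbed "in various ways"), and it omits any further training of the child models. All function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Linear classifier: weights holds one weight row per class."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def make_children(parent, num_children, noise_std, rng):
    """Create child models by adding Gaussian noise to the parent weights."""
    return [
        [[w + rng.gauss(0.0, noise_std) for w in row] for row in parent]
        for _ in range(num_children)
    ]


def ensemble_predict(models, x):
    """Average the class probabilities of all ensemble members."""
    probs = [predict(m, x) for m in models]
    return [sum(p[c] for p in probs) / len(probs) for c in range(len(probs[0]))]


rng = random.Random(0)
parent = [[1.0, -0.5], [-1.0, 0.5]]  # stand-in for a converged parent model
children = make_children(parent, num_children=4, noise_std=0.1, rng=rng)
avg_probs = ensemble_predict(children, [0.8, 0.2])
print(avg_probs)
```

Only the parent pays the full training cost here. Each child starts from the perturbed parent weights, which is where the training-time saving described in the abstract comes from.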

LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space

Oct 14, 2024

Shunsuke Sakai, Tatsuhito Hasegawa, Makoto Koshino

Abstract: Detecting anomalies such as incorrect combinations of objects or deviations in their positions is a challenging problem in industrial anomaly detection. Traditional methods mainly focus on local features of normal images, such as scratches and dirt, which makes it difficult to detect anomalies in the relationships between features. Masked image modeling (MIM) is a self-supervised learning technique that predicts the feature representation of masked regions in an image. Reconstructing the masked regions requires understanding how the image is composed, which allows the model to learn relationships between features within the image. We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively. To address blurriness in the reconstructed image, we replace pixel prediction with predicting the probability distribution of discrete latent variables of the masked regions using a tokenizer. We evaluated the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867, surpassing traditional reconstruction-based and distillation-based methods.

* Under Review
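
One way to read the abstract above is that the model outputs, for each masked patch, a probability distribution over discrete tokenizer codes. The sketch below illustrates how such predictions could be compared against the tokenizer's actual codes. The scoring rule (mean negative log-likelihood over masked patches) is an assumption for illustration, not something the abstract specifies, and all names and toy values are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def masked_token_nll(logits_per_patch, target_tokens, mask):
    """Mean NLL of the tokenizer's codes at masked patches only.

    logits_per_patch: one list of vocabulary logits per patch.
    target_tokens: token id the tokenizer assigned to each patch.
    mask: True where the patch was masked and had to be predicted.
    """
    losses = [
        -log_softmax(logits)[tok]
        for logits, tok, is_masked in zip(logits_per_patch, target_tokens, mask)
        if is_masked
    ]
    if not losses:
        raise ValueError("mask selects no patches")
    return sum(losses) / len(losses)


# Toy example: 3 patches, vocabulary of 4 codes, patches 0 and 2 masked.
logits = [[4.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 4.0, 0.0]]
mask = [True, False, True]
expected_layout = [0, 1, 2]    # codes the model finds likely
unexpected_layout = [3, 1, 1]  # codes the model finds unlikely

score_expected = masked_token_nll(logits, expected_layout, mask)
score_unexpected = masked_token_nll(logits, unexpected_layout, mask)
print(score_expected, score_unexpected)
```

Under this assumed rule, an image whose codes at masked positions disagree with the model's predictions receives a higher score, which is the kind of signal a logical anomaly would produce.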