Title: Multi-Image Visual Question Answering for Unsupervised Anomaly Detection

Apr 11, 2024

Jun Li, Cosmin I. Bercea, Philip Müller, Lina Felsner, Suhwan Kim, Daniel Rueckert, Benedikt Wiestler, Julia A. Schnabel

Abstract: Unsupervised anomaly detection identifies potential pathological areas by comparing original images with their pseudo-healthy reconstructions, which are generated by models trained exclusively on normal images. However, the resulting anomaly maps are difficult to interpret clinically because they lack detailed, understandable explanations. Recent language models have shown that they can mimic human-like understanding and provide detailed descriptions. This raises an interesting question: how can language models be employed to make anomaly maps more explainable? To the best of our knowledge, we are the first to leverage a language model for unsupervised anomaly detection, for which we construct a dataset with different questions and answers. Additionally, we present a novel multi-image visual question answering framework tailored for anomaly detection, incorporating diverse feature fusion strategies to enhance visual knowledge extraction. Our experiments show that the framework, augmented by our new Knowledge Q-Former module, adeptly answers questions on the anomaly detection dataset. In addition, integrating anomaly maps as inputs distinctly aids in improving the detection of unseen pathologies.

* 13 pages, 8 figures
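
The abstract describes anomaly maps as the result of comparing an original image with its pseudo-healthy reconstruction. The listing does not say how the paper computes that comparison, so the following is only a minimal sketch that assumes a pixel-wise absolute residual, a common simple choice. The function names `anomaly_map` and `threshold_map`, the toy 3x3 images, and the threshold of 0.5 are all hypothetical and chosen for illustration.

```python
from typing import List

# A grayscale image as nested lists of intensities (hypothetical toy format).
Image = List[List[float]]


def anomaly_map(original: Image, reconstruction: Image) -> Image:
    """Pixel-wise absolute residual between an image and its reconstruction.

    Assumption: the absolute difference stands in for whatever comparison
    the paper actually uses; it is not taken from the source.
    """
    if len(original) != len(reconstruction):
        raise ValueError("images must have the same number of rows")
    result: Image = []
    for row_o, row_r in zip(original, reconstruction):
        if len(row_o) != len(row_r):
            raise ValueError("images must have the same number of columns")
        result.append([abs(o - r) for o, r in zip(row_o, row_r)])
    return result


def threshold_map(amap: Image, threshold: float) -> List[List[int]]:
    """Binarize an anomaly map: 1 where the residual exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in amap]


# Toy example: the centre pixel is bright in the original, but the
# reconstruction (imagined output of a model trained on normal images)
# replaces it with a "healthy" intensity.
original = [
    [0.1, 0.2, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.2, 0.1],
]
reconstruction = [
    [0.1, 0.2, 0.1],
    [0.1, 0.2, 0.1],
    [0.1, 0.2, 0.1],
]

amap = anomaly_map(original, reconstruction)
mask = threshold_map(amap, threshold=0.5)
```

In this toy case only the centre pixel differs between the two images, so it is the only location flagged in `mask`. A map like `amap` is the kind of additional input the abstract reports as helpful for detecting unseen pathologies.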
