Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Teruyuki Katsuoka

Statistical Test for Saliency Maps of Graph Neural Networks via Selective Inference

May 22, 2025

Shuichi Nishino, Tomohiro Shiraishi, Teruyuki Katsuoka, Ichiro Takeuchi

Abstract:Graph Neural Networks (GNNs) have gained prominence for their ability to process graph-structured data across various domains. However, interpreting GNN decisions remains a significant challenge, leading to the adoption of saliency maps for identifying influential nodes and edges. Despite their utility, the reliability of GNN saliency maps has been questioned, particularly in terms of their robustness to noise. In this study, we propose a statistical testing framework to rigorously evaluate the significance of saliency maps. Our main contribution lies in addressing the inflation of the Type I error rate caused by double-dipping of data, leveraging the framework of Selective Inference. Our method provides statistically valid $p$-values while controlling the Type I error rate, ensuring that identified salient subgraphs contain meaningful information rather than random artifacts. To demonstrate the effectiveness of our method, we conduct experiments on both synthetic and real-world datasets, showing its effectiveness in assessing the reliability of GNN interpretations.

Via

Access Paper or Ask Questions

Statistically Significant $k$NNAD by Selective Inference

Feb 18, 2025

Mizuki Niihori, Teruyuki Katsuoka, Tomohiro Shiraishi, Shuichi Nishino, Ichiro Takeuchi

Abstract:In this paper, we investigate the problem of unsupervised anomaly detection using the k-Nearest Neighbor method. The k-Nearest Neighbor Anomaly Detection (kNNAD) is a simple yet effective approach for identifying anomalies across various domains and fields. A critical challenge in anomaly detection, including kNNAD, is appropriately quantifying the reliability of detected anomalies. To address this, we formulate kNNAD as a statistical hypothesis test and quantify the probability of false detection using $p$-values. The main technical challenge lies in performing both anomaly detection and statistical testing on the same data, which hinders correct $p$-value calculation within the conventional statistical testing framework. To resolve this issue, we introduce a statistical hypothesis testing framework called Selective Inference (SI) and propose a method named Statistically Significant NNAD (Stat-kNNAD). By leveraging SI, the Stat-kNNAD method ensures that detected anomalies are statistically significant with theoretical guarantees. The proposed Stat-kNNAD method is applicable to anomaly detection in both the original feature space and latent feature spaces derived from deep learning models. Through numerical experiments on synthetic data and applications to industrial product anomaly detection, we demonstrate the validity and effectiveness of the Stat-kNNAD method.

* 40 pages, 11 figures

Via

Access Paper or Ask Questions

Time Series Anomaly Detection in the Frequency Domain with Statistical Reliability

Feb 05, 2025

Akifumi Yamada, Tomohiro Shiraishi, Shuichi Nishino, Teruyuki Katsuoka, Kouichi Taji, Ichiro Takeuchi

Abstract:Effective anomaly detection in complex systems requires identifying change points (CPs) in the frequency domain, as abnormalities often arise across multiple frequencies. This paper extends recent advancements in statistically significant CP detection, based on Selective Inference (SI), to the frequency domain. The proposed SI method quantifies the statistical significance of detected CPs in the frequency domain using $p$-values, ensuring that the detected changes reflect genuine structural shifts in the target system. We address two major technical challenges to achieve this. First, we extend the existing SI framework to the frequency domain by appropriately utilizing the properties of discrete Fourier transform (DFT). Second, we develop an SI method that provides valid $p$-values for CPs where changes occur across multiple frequencies. Experimental results demonstrate that the proposed method reliably identifies genuine CPs with strong statistical guarantees, enabling more accurate root-cause analysis in the frequency domain of complex systems.

Via

Access Paper or Ask Questions

si4onnx: A Python package for Selective Inference in Deep Learning Models

Jan 29, 2025

Teruyuki Katsuoka, Tomohiro Shiraishi, Daiki Miwa, Shuichi Nishino, Ichiro Takeuchi

Abstract:In this paper, we introduce si4onnx, a package for performing selective inference on deep learning models. Techniques such as CAM in XAI and reconstruction-based anomaly detection using VAE can be interpreted as methods for identifying significant regions within input images. However, the identified regions may not always carry meaningful significance. Therefore, evaluating the statistical significance of these regions represents a crucial challenge in establishing the reliability of AI systems. si4onnx is a Python package that enables straightforward implementation of hypothesis testing with controlled type I error rates through selective inference. It is compatible with deep learning models constructed using common frameworks such as PyTorch and TensorFlow.

* 35pages, 3figures

Via

Access Paper or Ask Questions

Statistical Test for Generated Hypotheses by Diffusion Models

Feb 19, 2024

Teruyuki Katsuoka, Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

Abstract:The enhanced performance of AI has accelerated its integration into scientific research. In particular, the use of generative AI to create scientific hypotheses is promising and is increasingly being applied across various fields. However, when employing AI-generated hypotheses for critical decisions, such as medical diagnoses, verifying their reliability is crucial. In this study, we consider a medical diagnostic task using generated images by diffusion models, and propose a statistical test to quantify its reliability. The basic idea behind the proposed statistical test is to employ a selective inference framework, where we consider a statistical test conditional on the fact that the generated images are produced by a trained diffusion model. Using the proposed method, the statistical reliability of medical image diagnostic results can be quantified in the form of a p-value, allowing for decision-making with a controlled error rate. We show the theoretical validity of the proposed statistical test and its effectiveness through numerical experiments on synthetic and brain image datasets.

* 32pages, 6figures

Via

Access Paper or Ask Questions

Statistical Test for Anomaly Detections by Variational Auto-Encoders

Feb 06, 2024

Daiki Miwa, Tomohiro Shiraishi, Vo Nguyen Le Duy, Teruyuki Katsuoka, Ichiro Takeuchi

Figure 1 for Statistical Test for Anomaly Detections by Variational Auto-Encoders

Figure 2 for Statistical Test for Anomaly Detections by Variational Auto-Encoders

Figure 3 for Statistical Test for Anomaly Detections by Variational Auto-Encoders

Figure 4 for Statistical Test for Anomaly Detections by Variational Auto-Encoders

Abstract:In this study, we consider the reliability assessment of anomaly detection (AD) using Variational Autoencoder (VAE). Over the last decade, VAE-based AD has been actively studied in various perspective, from method development to applied research. However, when the results of ADs are used in high-stakes decision-making, such as in medical diagnosis, it is necessary to ensure the reliability of the detected anomalies. In this study, we propose the VAE-AD Test as a method for quantifying the statistical reliability of VAE-based AD within the framework of statistical testing. Using the VAE-AD Test, the reliability of the anomaly regions detected by a VAE can be quantified in the form of p-values. This means that if an anomaly is declared when the p-value is below a certain threshold, it is possible to control the probability of false detection to a desired level. Since the VAE-AD Test is constructed based on a new statistical inference framework called selective inference, its validity is theoretically guaranteed in finite samples. To demonstrate the validity and effectiveness of the proposed VAE-AD Test, numerical experiments on artificial data and applications to brain image analysis are conducted.

Via

Access Paper or Ask Questions

Statistical Test for Attention Map in Vision Transformer

Jan 19, 2024

Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka, Vo Nguyen Le Duy, Kouichi Taji, Ichiro Takeuchi

Abstract:The Vision Transformer (ViT) demonstrates exceptional performance in various computer vision tasks. Attention is crucial for ViT to capture complex wide-ranging relationships among image patches, allowing the model to weigh the importance of image patches and aiding our understanding of the decision-making process. However, when utilizing the attention of ViT as evidence in high-stakes decision-making tasks such as medical diagnostics, a challenge arises due to the potential of attention mechanisms erroneously focusing on irrelevant regions. In this study, we propose a statistical test for ViT's attentions, enabling us to use the attentions as reliable quantitative evidence indicators for ViT's decision-making with a rigorously controlled error rate. Using the framework called selective inference, we quantify the statistical significance of attentions in the form of p-values, which enables the theoretically grounded quantification of the false positive detection probability of attentions. We demonstrate the validity and the effectiveness of the proposed method through numerical experiments and applications to brain image diagnoses.

* 42pages, 17figures

Via

Access Paper or Ask Questions