Saurabh Khanduja

Neural Response Interpretation through the Lens of Critical Pathways

Mar 31, 2021

Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab

Abstract: Is critical input information encoded in specific sparse pathways within the neural network? In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input. The pruning objective -- selecting the smallest group of neurons for which the response remains equivalent to the original network -- has been previously proposed for identifying critical pathways. We demonstrate that sparse pathways derived from pruning do not necessarily encode critical input information. To ensure sparse pathways include critical fragments of the encoded input information, we propose pathway selection via neurons' contribution to the response. We proceed to explain how critical pathways can reveal critical input features. We prove that pathways selected via neuron contribution are locally linear (in an L2-ball), a property that we use for proposing a feature attribution method: "pathway gradient". We validate our interpretation method using mainstream evaluation experiments. The validation of pathway gradient interpretation method further confirms that selected pathways using neuron contributions correspond to critical input features. The code is publicly available.

* Accepted at CVPR 2021 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)
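
The idea in the abstract above (select neurons by their contribution to the response, then take the gradient of the resulting sparse pathway) can be illustrated with a minimal NumPy sketch. This is not the authors' released code. The tiny bias-free two-layer ReLU network, the contribution score `|activation * gradient|`, and the names `forward`, `select_pathway`, and `pathway_gradient` are all assumptions made for illustration.

```python
import numpy as np

def forward(x, W1, W2, mask=None):
    """Bias-free two-layer ReLU network with a scalar output.
    `mask` (0/1 per hidden neuron) prunes hidden neurons when given."""
    h = np.maximum(W1 @ x, 0.0)
    if mask is not None:
        h = h * mask
    return float(W2 @ h), h

def select_pathway(x, W1, W2, keep):
    """Keep the `keep` hidden neurons with the largest contribution score.
    Assumed score: |activation * d(output)/d(activation)|. The output layer
    is linear here, so that derivative is simply W2."""
    _, h = forward(x, W1, W2)
    contribution = np.abs(h * W2)
    mask = np.zeros_like(h)
    mask[np.argsort(-contribution)[:keep]] = 1.0
    return mask

def pathway_gradient(x, W1, W2, mask):
    """Gradient of the pruned network's output with respect to the input."""
    active = (W1 @ x > 0).astype(float) * mask  # ReLU gates restricted to the pathway
    return (W2 * active) @ W1

# Hypothetical toy data, used only to exercise the functions.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 5))
W2 = rng.normal(size=8)
x = rng.normal(size=5)

mask = select_pathway(x, W1, W2, keep=3)
out_pruned, _ = forward(x, W1, W2, mask)
grad = pathway_gradient(x, W1, W2, mask)

# With no biases, the pruned network is linear around x,
# so its output equals gradient . input.
print(out_pruned, grad @ x)
```

Because the mask and the ReLU gates are held fixed, the pruned network behaves as a linear map near `x` in this toy setting, which is the property the attribution relies on. How the selected pathway is chosen in the paper itself may differ from the top-k rule assumed here.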

Explaining Neural Networks via Perturbing Important Learned Features

Nov 25, 2019

Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Seong Tae Kim, Nassir Navab

Abstract: Attributing the output of a neural network to the contribution of given input elements is one way of shedding light on the black box nature of neural networks. We propose a novel input feature attribution method that finds an input perturbation that maximally changes the output neuron by exclusively perturbing important hidden neurons (i.e. learned features) on the path to output neuron. Given an input, this is achieved by 1) pruning unimportant neurons, and subsequently 2) finding a local input perturbation that maximizes the output in the pruned network. Since our method considers the importance of hidden neurons (high-level features), it inherently considers interdependencies between multiple input elements, which is vital for input feature attribution. We propose PruneGrad, an efficient gradient-based solution for the pruning and perturbation steps of our method. The efficacy of our method is evaluated by quantitatively benchmarking against other attribution methods using 1) sanity checks, 2) pixel perturbation, and 3) Remove and Retrain (ROAR). Our results show that while most of the existing attribution methods are prone to fail or get mediocre results in at least one benchmark, our proposed method achieves state of the art results in all three benchmarks. The results are further supported by comparative visual evaluation.
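
The pixel perturbation benchmark named in the abstract above is not specified on this page. As a rough sketch, one common form ranks input features by attribution, replaces them with a baseline value in stages, and records how the model output changes. The function name `perturbation_curve`, the zero baseline, the removal order, and the toy linear model below are all assumptions, not the paper's protocol; some variants remove the least important features first, which the `most_important_first` flag allows.

```python
import numpy as np

def perturbation_curve(model, x, attribution, steps,
                       baseline=0.0, most_important_first=True):
    """Replace features with `baseline` in `steps` stages, ordered by
    attribution, and return the model output after each stage
    (the first entry is the unperturbed output)."""
    order = np.argsort(-attribution if most_important_first else attribution)
    x_pert = x.astype(float).copy()
    outputs = [model(x_pert)]
    for chunk in np.array_split(order, steps):
        x_pert[chunk] = baseline  # remove this group of features
        outputs.append(model(x_pert))
    return np.array(outputs)

# Hypothetical toy model: a linear scorer, where gradient * input
# is an exact per-feature attribution.
w = np.array([3.0, -1.0, 0.5, 2.0])
model = lambda v: float(w @ v)
x = np.ones(4)
attribution = w * x

curve = perturbation_curve(model, x, attribution, steps=4)
print(curve)  # [ 4.5  1.5 -0.5 -1.   0. ]
```

When the most important features are removed first, a steeper early drop in the curve suggests that the attribution ranks influential features highly.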
