Abstract: This paper studies emergent phenomena in neural networks by focusing on grokking, where models suddenly generalize long after memorizing the training data. To understand this phase transition, we use higher-order mutual information to analyze the collective behavior (synergy) and shared properties (redundancy) of neurons during training. We identify distinct phases before grokking, allowing us to anticipate when it occurs. We attribute grokking to an emergent phase transition driven by the synergistic interactions among neurons as a whole. We show that weight decay and weight initialization can enhance the emergent phase.
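The abstract above quantifies synergy and redundancy via higher-order mutual information; as a worked reference, one standard measure of this kind (and the one named in the next abstract) is the O-information of Rosas et al. (2019), commonly written as

\Omega(X_1,\dots,X_n) = \mathrm{TC} - \mathrm{DTC} = (n-2)\,H(X_1,\dots,X_n) + \sum_{i=1}^{n}\bigl[ H(X_i) - H(X_{\setminus i}) \bigr],

where \mathrm{TC} = \sum_i H(X_i) - H(X_1,\dots,X_n) is the total correlation, \mathrm{DTC} = H(X_1,\dots,X_n) - \sum_i H(X_i \mid X_{\setminus i}) is the dual total correlation, and X_{\setminus i} denotes all variables except X_i. A positive \Omega indicates redundancy-dominated interactions, a negative \Omega synergy-dominated ones.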
Abstract: Quantifying which neurons are important to the classification decision of a trained neural network is essential for understanding its inner workings. Previous work primarily attributed importance to individual neurons. In this work, we study which groups of neurons contain synergistic or redundant information using a multivariate mutual information measure called the O-information. We observe that the first layer is dominated by redundancy, suggesting general shared features (e.g. edge detection), while the last layer is dominated by synergy, indicating localized class-specific features (e.g. concepts). Finally, we show that the O-information can be used to measure multi-neuron importance, which we demonstrate by retraining a synergistic sub-network with minimal change in performance. These results suggest our method can be used for pruning and unsupervised representation learning.
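A minimal numpy sketch of how such an O-information score could be computed from a matrix of neuron activations, assuming a Gaussian (covariance-based) entropy estimator; the helper names are illustrative and the estimator choice is an assumption, not necessarily what the paper used:

import numpy as np

def gaussian_entropy(cov):
    # Differential entropy of a multivariate Gaussian in nats:
    # H = 0.5 * log((2*pi*e)^d * det(cov)).
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

def o_information(acts):
    # acts: (samples, n) matrix of activations for n neurons.
    # Omega = (n - 2) * H(X) + sum_i [ H(X_i) - H(X without i) ].
    n = acts.shape[1]
    cov = np.cov(acts, rowvar=False)
    omega = (n - 2) * gaussian_entropy(cov)
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        omega += gaussian_entropy(cov[np.ix_([i], [i])])
        omega -= gaussian_entropy(cov[np.ix_(rest, rest)])
    return omega  # > 0: redundancy-dominated; < 0: synergy-dominated

# Sanity check on synthetic data (illustrative only):
rng = np.random.default_rng(0)
s = rng.normal(size=(5000, 1))
red = s + 0.1 * rng.normal(size=(5000, 3))        # three noisy copies of one source
a, b = rng.normal(size=5000), rng.normal(size=5000)
syn = np.column_stack([a, b, a + b + 0.1 * rng.normal(size=5000)])
print(o_information(red), o_information(syn))     # expect: positive, negative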
Abstract: Granger causality is a statistical notion of causal influence based on prediction via vector autoregression. For Gaussian variables it is equivalent to transfer entropy, an information-theoretic measure of time-directed information transfer between jointly dependent processes. We exploit this equivalence to calculate exactly the 'local Granger causality', i.e. the profile of the information transfer at each discrete time point in Gaussian processes; in this framework, Granger causality is the average of its local version. Our approach offers a robust and computationally fast method for following the information transfer along the time history of linear stochastic processes, as well as of nonlinear complex systems studied in the Gaussian approximation.
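A minimal sketch of such a local profile for a bivariate Gaussian pair, assuming a VAR(p) model fit by ordinary least squares: the local value is the pointwise log-ratio of the full and restricted Gaussian predictive densities (local transfer entropy), whose time average recovers the usual summary value. The function and variable names are illustrative, not the paper's code:

import numpy as np

def _ols_residuals(design, target):
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta

def local_granger(x, y, p=1):
    # Local information transfer y -> x (nats per time point):
    # log p(x_t | x_past, y_past) - log p(x_t | x_past) under Gaussian residuals.
    T = len(x)
    ones = np.ones(T - p)
    x_past = np.column_stack([x[p - 1 - k : T - 1 - k] for k in range(p)])
    y_past = np.column_stack([y[p - 1 - k : T - 1 - k] for k in range(p)])
    target = x[p:]
    res_r = _ols_residuals(np.column_stack([ones, x_past]), target)           # own past only
    res_f = _ols_residuals(np.column_stack([ones, x_past, y_past]), target)   # joint past
    var_r, var_f = res_r.var(), res_f.var()
    # Pointwise log-density ratio of the two Gaussian predictions.
    return 0.5 * np.log(var_r / var_f) + res_r**2 / (2 * var_r) - res_f**2 / (2 * var_f)

# Illustrative use on a coupled pair (y drives x at lag 1):
rng = np.random.default_rng(1)
T = 2000
y = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + 0.4 * y[t - 1] + 0.1 * rng.normal()
profile = local_granger(x, y, p=1)
print(profile.mean())  # for Gaussians, 2 * mean matches the Granger log-variance ratio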
Abstract: We generalize a previously proposed approach to nonlinear Granger causality for time series, based on radial basis functions. The proposed model is not constrained to be additive in the variables from the two time series and can approximate any function of these variables, while remaining suitable for evaluating causality. The usefulness of this measure of causality is demonstrated in a physiological example and in the study of the feedback loop in a model of excitatory and inhibitory neurons.
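A sketch of the same variance-comparison idea with a nonlinear regressor, using scikit-learn's kernel ridge regression with an RBF kernel as a stand-in for the paper's radial basis function model (an assumption; the paper's model and fitting procedure differ in detail). Like the paper's formulation, the full model takes the joint, non-additive input of both pasts:

import numpy as np
from sklearn.kernel_ridge import KernelRidge

def rbf_granger(x, y, p=1, alpha=1e-2, gamma=1.0):
    # Nonlinear Granger causality y -> x: compare RBF-kernel predictions of x_t
    # from its own past against predictions from the joint past of x and y.
    T = len(x)
    x_past = np.column_stack([x[p - 1 - k : T - 1 - k] for k in range(p)])
    y_past = np.column_stack([y[p - 1 - k : T - 1 - k] for k in range(p)])
    target = x[p:]
    restricted = KernelRidge(kernel="rbf", alpha=alpha, gamma=gamma).fit(x_past, target)
    joint = np.column_stack([x_past, y_past])
    full = KernelRidge(kernel="rbf", alpha=alpha, gamma=gamma).fit(joint, target)
    var_r = np.var(target - restricted.predict(x_past))
    var_f = np.var(target - full.predict(joint))
    return np.log(var_r / var_f)  # > 0 suggests a nonlinear influence of y on x

Because the full model nests the restricted one, in-sample residual variance can only decrease; a serious application would score held-out predictions or calibrate against surrogate data, which this sketch omits.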