Abstract: Understanding the structure of neural network loss surfaces, particularly the emergence of low-loss tunnels, is critical for advancing neural network theory and practice. In this paper, we propose a novel approach to directly embed loss tunnels into the loss landscape of neural networks. Exploring the properties of these loss tunnels offers new insights into their length and structure and sheds light on some common misconceptions. We then apply our approach to Bayesian neural networks, where we improve subspace inference by identifying pitfalls and proposing a more natural prior that better guides the sampling procedure.
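To make the notion of a low-loss tunnel concrete, the following minimal sketch evaluates the loss along a quadratic Bézier curve connecting two trained weight vectors, in the spirit of standard mode-connectivity constructions. This is an illustrative assumption, not the embedding approach proposed in the paper; `loss_fn`, `w_a`, `w_b`, and `w_bend` are hypothetical placeholders.

```python
# Illustrative sketch (not the paper's method): probe the loss along a
# parametric curve in weight space between two trained solutions.
import numpy as np

def bezier_point(t, w_a, w_bend, w_b):
    """Quadratic Bezier curve in weight space with one bend point."""
    return (1 - t) ** 2 * w_a + 2 * t * (1 - t) * w_bend + t * w_b

def loss_along_path(loss_fn, w_a, w_bend, w_b, n_points=21):
    """Evaluate loss_fn (a callable on a flattened weight vector) at evenly
    spaced points along the curve. A flat, low profile over t in [0, 1]
    indicates a low-loss tunnel connecting the two solutions."""
    ts = np.linspace(0.0, 1.0, n_points)
    return np.array([loss_fn(bezier_point(t, w_a, w_bend, w_b)) for t in ts])
```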
Abstract: The recently developed Prior-Data Fitted Networks (PFNs) have shown very promising results for applications in low-data regimes. The TabPFN model, a special case of PFNs for tabular data, achieves state-of-the-art performance on a variety of classification tasks while producing posterior predictive distributions in mere seconds by in-context learning, without the need for parameter training or hyperparameter tuning. This makes TabPFN a very attractive option for a wide range of domain applications. However, a major drawback of the method is its lack of interpretability. Therefore, we propose several adaptations of popular interpretability methods that we specifically design for TabPFN. By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations than existing implementations. In particular, we show how in-context learning facilitates the estimation of Shapley values by avoiding approximate retraining and enables the use of Leave-One-Covariate-Out (LOCO) even when working with large-scale Transformers. In addition, we demonstrate how data valuation methods can be used to address scalability challenges of TabPFN. Our proposed methods are implemented in the package tabpfn_iml and made available at https://github.com/david-rundel/tabpfn_iml.
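As a rough illustration of why in-context learning makes retraining-based attributions affordable, the sketch below computes LOCO importances by refitting TabPFN once per dropped feature. It assumes the public `tabpfn` package with its scikit-learn-style `TabPFNClassifier` interface and NumPy-array inputs; it is not the tabpfn_iml implementation, and `loco_importance` and its arguments are hypothetical names.

```python
# Minimal LOCO sketch with TabPFN: each "refit" is a single in-context
# learning pass, so no gradient-based retraining is needed.
import numpy as np
from sklearn.metrics import log_loss
from tabpfn import TabPFNClassifier

def loco_importance(X_train, y_train, X_test, y_test):
    """Leave-One-Covariate-Out importances via repeated in-context learning."""
    labels = np.unique(y_train)

    clf = TabPFNClassifier()
    clf.fit(X_train, y_train)
    base_loss = log_loss(y_test, clf.predict_proba(X_test), labels=labels)

    importances = np.zeros(X_train.shape[1])
    for j in range(X_train.shape[1]):
        cols = [c for c in range(X_train.shape[1]) if c != j]
        clf_j = TabPFNClassifier()
        clf_j.fit(X_train[:, cols], y_train)
        loss_j = log_loss(y_test, clf_j.predict_proba(X_test[:, cols]), labels=labels)
        importances[j] = loss_j - base_loss  # loss increase when feature j is removed
    return importances
```

Because TabPFN conditions on the training set at prediction time rather than learning weights from it, dropping a column and "refitting" costs only one additional forward pass per feature, which is what makes this retraining-style diagnostic practical.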