Abstract: This paper introduces a general method for exploring equivalence classes in the input space of Transformer models. The proposed approach rests on a sound mathematical theory that describes the internal layers of a Transformer architecture as sequential deformations of the input manifold. Using the eigendecomposition of the pullback, through the Jacobian of the model, of the distance metric defined on the output space, we reconstruct equivalence classes in the input space and navigate across them. We illustrate how this method can serve as a powerful tool for investigating how a Transformer sees the input space, facilitating local and task-agnostic explainability in Computer Vision and Natural Language Processing tasks.
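The pullback-and-eigendecomposition step mentioned above can be illustrated concretely. The following is a minimal sketch, not the paper's implementation: a toy map f stands in for a trained Transformer, the output metric is taken to be Euclidean, and the Jacobian is approximated by finite differences. Eigenvectors of the pulled-back metric with near-zero eigenvalues span the directions along which the output is locally unchanged, i.e. the local equivalence class.

```python
import numpy as np

def pullback_metric(f, x, eps=1e-5):
    """Approximate G(x) = J(x)^T J(x), the pullback of the Euclidean
    output metric through the Jacobian J of f at x (finite differences)."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - y0) / eps
    return J.T @ J

# Toy model from R^3 to R^2 standing in for a trained network (assumption).
f = lambda x: np.array([np.tanh(x[0] + x[1]), x[2] ** 2])
x = np.array([0.3, -0.1, 0.5])

G = pullback_metric(f, x)
eigvals, eigvecs = np.linalg.eigh(G)
# Eigenvectors with (near-)zero eigenvalues point along the local
# equivalence class of x; the others indicate directions the model is
# sensitive to.
print(eigvals)
```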
Abstract: Neural networks play a crucial role in everyday life, with the most modern generative models able to achieve impressive results. Nonetheless, their inner workings are still not well understood, and several strategies have been adopted to study how and why these models reach their outputs. A common approach is to consider the data in a Euclidean setting; recent years have instead witnessed a shift from this paradigm towards a more general framework, namely Riemannian geometry. Two recent works introduced a geometric framework for studying neural networks that makes use of singular Riemannian metrics. In this paper we extend these results to convolutional, residual and recursive neural networks, also studying the case of non-differentiable activation functions, such as ReLU. We illustrate our findings with numerical experiments on image classification and thermodynamic problems.
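The abstract does not fix notation, but the singular Riemannian metrics it refers to arise from a standard construction: pulling a metric back through a layer map whose differential may have a nontrivial kernel. A minimal worked statement of this (an assumption about the setup, requiring amsmath):

```latex
% Pullback of a Riemannian metric g_N through a smooth layer map
% \Lambda : M \to N.
\[
  (\Lambda^{*} g_N)_p(u, v)
    \;=\; (g_N)_{\Lambda(p)}\bigl(d\Lambda_p(u),\, d\Lambda_p(v)\bigr),
  \qquad u, v \in T_pM .
\]
% When g_N is positive definite, the kernel of the pulled-back form at p
% coincides with the kernel of d\Lambda_p, so the pullback is a singular
% (degenerate) Riemannian metric wherever d\Lambda_p is not injective.
\[
  \ker\bigl((\Lambda^{*} g_N)_p\bigr) \;=\; \ker\bigl(d\Lambda_p\bigr).
\]
```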
Abstract: In a previous work, we proposed a geometric framework for studying a deep neural network, seen as a sequence of maps between manifolds, employing singular Riemannian geometry. In this paper, we present an application of this framework, proposing a way to build the equivalence class of an input point: such a class is defined as the set of points on the input manifold mapped to the same output by the neural network. In other words, we build the preimage in the input space of a point in the output manifold. In particular, focusing for simplicity on neural network maps from n-dimensional real spaces to (n - 1)-dimensional real spaces, we propose an algorithm to build the set of points lying in the same equivalence class. This approach leads to two main applications: the generation of new synthetic data, and insight into how a classifier can be confused by small perturbations of the input data (e.g., a penguin image classified as an image containing a chihuahua). In addition, for neural networks from 2D to 1D real spaces, we also discuss how to find the preimages of closed intervals of the real line. We also present numerical experiments with several neural networks trained to perform non-linear regression tasks, including the case of a binary classifier.
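For a map from R^n to R^(n-1), the equivalence class of a point is generically a curve, and one natural way to trace it is to step along the null direction of the Jacobian, where the output is locally constant. The sketch below is an illustration under that assumption, not the paper's algorithm; the map f and the step parameters are placeholders.

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f at x."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - y0) / eps
    return J

def trace_equivalence_class(f, x0, steps=200, h=1e-2):
    """Trace the level set of f through x0 by stepping along the
    null direction of the (n-1) x n Jacobian."""
    path, x, v_prev = [x0.copy()], x0.copy(), None
    for _ in range(steps):
        J = jacobian(f, x)
        _, _, Vt = np.linalg.svd(J)
        v = Vt[-1]                              # direction with J v ~ 0
        if v_prev is not None and np.dot(v, v_prev) < 0:
            v = -v                              # keep a consistent orientation
        x = x + h * v
        v_prev = v
        path.append(x.copy())
    return np.array(path)

# Toy map from R^2 to R: the traced path stays on a level curve of f.
f = lambda x: np.array([x[0] ** 2 + x[1] ** 2])
path = trace_equivalence_class(f, np.array([1.0, 0.0]))
print(f(path[0]), f(path[-1]))  # approximately equal, up to discretization error
```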
Abstract: Deep Neural Networks are widely used for solving complex problems in several scientific areas, such as speech recognition, machine translation and image analysis. The strategies employed to investigate their theoretical properties mainly rely on Euclidean geometry, but in recent years new approaches based on Riemannian geometry have been developed. Motivated by some open problems, we study a particular sequence of maps between manifolds, with the last manifold of the sequence equipped with a Riemannian metric. We investigate the structures induced through pullbacks on the other manifolds of the sequence and on some related quotients. In particular, we show that the pullback of the final Riemannian metric to any manifold of the sequence is a degenerate Riemannian metric inducing a structure of pseudometric space, and that the Kolmogorov quotient of this pseudometric space yields a smooth manifold, which is the base space of a particular vertical bundle. We investigate the theoretical properties of the maps of such a sequence; finally, we focus on the case of maps between manifolds implementing neural networks of practical interest and present some applications of the geometric framework introduced in the first part of the paper.
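The construction summarized in this abstract can be written out schematically. The notation below is an assumption (the abstract itself fixes none) and is only meant to make the pullback-and-quotient step explicit; it requires amsmath.

```latex
% A sequence of maps with the final manifold carrying a Riemannian metric g;
% each intermediate manifold receives the pullback of g through the
% remaining composition, which is degenerate in general.
\[
  M_0 \xrightarrow{\;\Lambda_1\;} M_1 \xrightarrow{\;\Lambda_2\;} \cdots
      \xrightarrow{\;\Lambda_k\;} (M_k, g),
  \qquad
  g_i \;=\; (\Lambda_k \circ \cdots \circ \Lambda_{i+1})^{*}\, g .
\]
% The induced pseudodistance and the Kolmogorov quotient identifying points
% at zero distance.
\[
  d_i(p, q) \;=\; \inf_{\gamma:\, p \to q} \int \sqrt{g_i(\dot\gamma, \dot\gamma)}\, dt,
  \qquad
  M_i / \!\sim \quad\text{with}\quad p \sim q \iff d_i(p, q) = 0 .
\]
```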