Francesco Insulla

Towards a Learning Theory of Representation Alignment

Feb 19, 2025

Francesco Insulla, Shuo Huang, Lorenzo Rosasco

Abstract: It has recently been argued that AI models' representations are becoming aligned as their scale and performance increase. Empirical analyses have been designed to support this idea and conjecture the possible alignment of different representations toward a shared statistical model of reality. In this paper, we propose a learning-theoretic perspective to representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is relating properties of stitching to the kernel alignment of the underlying representation. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem.
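
For readers unfamiliar with the term, the sketch below shows one common textbook form of kernel alignment: the normalized Frobenius inner product between two Gram matrices. It is only an illustration. The choice of a linear kernel, the absence of centering, and all names (`gram`, `kernel_alignment`, `rep_a`, and so on) are assumptions made here, and the paper's own definition may differ.

```python
import math

def gram(features):
    """Linear-kernel Gram matrix: K[i][j] = <x_i, x_j> (assumed kernel choice)."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]

def frobenius_inner(k1, k2):
    """Frobenius inner product: sum of elementwise products of two matrices."""
    return sum(a * b for r1, r2 in zip(k1, k2) for a, b in zip(r1, r2))

def kernel_alignment(k1, k2):
    """Uncentered kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )

# Hypothetical toy representations of the same three inputs.
rep_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rep_b = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]  # rep_a rotated by 90 degrees
rep_c = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]    # collapses all inputs to one point

align_ab = kernel_alignment(gram(rep_a), gram(rep_b))
align_ac = kernel_alignment(gram(rep_a), gram(rep_c))
print(align_ab, align_ac)
```

Because a rotation leaves all inner products unchanged, `rep_a` and `rep_b` produce identical Gram matrices and are perfectly aligned under this measure, while the collapsed representation `rep_c` scores strictly lower.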

Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models

Dec 10, 2024

Santiago Aranguri, Francesco Insulla

Abstract: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Previous work shows that the phase where the relative probability between the modes is learned disappears as the dimension goes to infinity without an appropriate time schedule. We introduce a time dilation that solves this problem. This enables us to characterize the learned velocity field, finding a first phase where the probability of each mode is learned and a second phase where the variance of each mode is learned. We find that the autoencoder representing the velocity field learns to simplify by estimating only the parameters relevant to each phase. Turning to real data, we propose a method that, for a given feature, finds intervals of time where training improves accuracy the most on that feature. Since practitioners take a uniform distribution over training times, our method enables more efficient training. We provide preliminary experiments validating this approach.
