Title: Feature emergence via margin maximization: case studies in algebraic tasks

Nov 13, 2023

Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, Rosie Zhao, Sham Kakade

Abstract: Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in some cases towards understanding how neural networks implement specific target functions, this paper explores a complementary question: why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and finite group operations. Our primary theoretical findings analytically characterize the features learned by stylized neural networks for these algebraic tasks. Notably, our main technique demonstrates how the principle of margin maximization alone can be used to fully specify the features learned by the network. Specifically, we prove that the trained networks utilize Fourier features to perform modular addition and employ features corresponding to irreducible group-theoretic representations to perform compositions in general groups, aligning closely with the empirical observations of Nanda et al. and Chughtai et al. More generally, we hope our techniques can help to foster a deeper understanding of why neural networks adopt specific computational strategies.
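To make the phrase "Fourier features to perform modular addition" concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's network or proof: the function names `fourier_logits` and `predict` and the choice of frequencies are assumptions made for this example. It scores each candidate output c with a sum of cosines of 2πk(a + b − c)/p, which for prime p is largest exactly when c ≡ a + b (mod p).

```python
import numpy as np

def fourier_logits(a, b, p, freqs):
    """Score every candidate output c in 0..p-1 for the input pair (a, b).

    Hypothetical helper for illustration: each frequency k contributes
    cos(2*pi*k*(a + b - c) / p), which equals 1 when c == (a + b) mod p
    and is strictly smaller otherwise (for prime p and k not divisible by p).
    """
    c = np.arange(p)                       # candidate outputs, shape (p,)
    k = np.asarray(freqs).reshape(-1, 1)   # frequencies, shape (num_freqs, 1)
    return np.cos(2 * np.pi * k * (a + b - c) / p).sum(axis=0)

def predict(a, b, p, freqs):
    """Return the candidate output with the highest score."""
    return int(np.argmax(fourier_logits(a, b, p, freqs)))

p = 7
freqs = [1, 2, 3]  # assumed choice: one frequency per conjugate pair mod 7

# Check every input pair against ordinary modular addition.
for a in range(p):
    for b in range(p):
        assert predict(a, b, p, freqs) == (a + b) % p

# Gap between the correct score and the best incorrect score for one input.
scores = fourier_logits(3, 5, p, freqs)
correct = (3 + 5) % p
margin = scores[correct] - np.delete(scores, correct).max()
```

The last lines compute the gap between the correct score and the best incorrect one for a single input, which is the kind of quantity a margin-maximization argument reasons about. The sketch does not show how training would arrive at these features.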
