Andrew M. Saxe

Flexible task abstractions emerge in linear networks with fast and bounded units

Nov 06, 2024
Via arXiv

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks

Sep 22, 2024
Via arXiv

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

Apr 10, 2024
Via arXiv

When Representations Align: Universality in Representation Learning Dynamics

Feb 14, 2024
Via arXiv

The Transient Nature of Emergent In-Context Learning in Transformers

Nov 15, 2023
Via arXiv

Meta-Learning Strategies through Value Maximization in Neural Networks

Oct 30, 2023
Via arXiv

Regularised neural networks mimic human insight

Feb 22, 2023
Via arXiv

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks

Jul 21, 2022
Via arXiv

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup

Jun 18, 2019
Via arXiv

Generalisation dynamics of online learning in over-parameterised neural networks

Jan 25, 2019
Via arXiv