Andrew Saxe

Training Dynamics of In-Context Learning in Linear Attention

Jan 27, 2025

Early learning of the optimal constant solution in neural networks and humans

Jun 25, 2024

When Are Bias-Free ReLU Networks Like Linear Networks?

Jun 18, 2024

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

Jun 10, 2024

Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks

Jun 03, 2024

A Theory of Unimodal Bias in Multimodal Learning

Dec 01, 2023

Continual task learning in natural and artificial agents

Oct 10, 2022

Know your audience: specializing grounded language models with the game of Dixit

Jun 16, 2022

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

May 18, 2022

Modelling continual learning in humans with Hebbian context gating and exponentially decaying task signals

Mar 22, 2022