Jeremy Bernstein

Modular Duality in Deep Learning

Oct 28, 2024
Via arXiv

Old Optimizer, New Norm: An Anthology

Sep 30, 2024
Via arXiv

Scalable Optimization in the Modular Norm

May 23, 2024
Via arXiv

Training Neural Networks from Scratch with Parallel Low-Rank Adapters

Feb 26, 2024
Via arXiv

A Spectral Condition for Feature Learning

Oct 26, 2023
Via arXiv

SketchOGD: Memory-Efficient Continual Learning

May 25, 2023
Via arXiv

Automatic Gradient Descent: Deep Learning without Hyperparameters

Apr 11, 2023
Via arXiv

Optimisation & Generalisation in Networks of Neurons

Oct 18, 2022
Via arXiv

Investigating Generalization by Controlling Normalized Margin

May 08, 2022
Via arXiv

On the Implicit Biases of Architecture & Gradient Descent

Oct 08, 2021
Via arXiv