
Lorenzo Noci

Understanding and Minimising Outlier Features in Neural Network Training

May 29, 2024

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Feb 27, 2024

How Good is a Single Basin?

Feb 05, 2024

Disentangling Linear Mode-Connectivity

Dec 15, 2023

Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

Sep 28, 2023

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

Jun 30, 2023

Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers

May 25, 2023

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning

Mar 31, 2023

The Curious Case of Benign Memorization

Oct 25, 2022

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Jun 07, 2022