Alexandru Meterez

The Optimization Landscape of SGD Across the Feature Learning Strength

Oct 06, 2024
Via arXiv

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Feb 27, 2024
Via arXiv

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Oct 03, 2023
Via arXiv