Antonio Orvieto

ETH Zurich

An Uncertainty Principle for Linear Recurrent Neural Networks

Feb 13, 2025

When, Where and Why to Average Weights?

Feb 10, 2025

Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

Nov 24, 2024

NIMBA: Towards Robust and Principled Processing of Point Clouds With SSMs

Oct 31, 2024

Loss Landscape Characterization of Neural Networks without Over-Parametrization

Oct 17, 2024

Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture

Oct 15, 2024

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

Jul 05, 2024

Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes

Jun 07, 2024

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

May 31, 2024

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks

May 24, 2024