Antonio Orvieto

ETH Zurich

NIMBA: Towards Robust and Principled Processing of Point Clouds With SSMs

Oct 31, 2024
Via arXiv

Loss Landscape Characterization of Neural Networks without Over-Parametrization

Oct 17, 2024
Via arXiv

Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture

Oct 15, 2024
Via arXiv

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

Jul 05, 2024
Via arXiv

Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes

Jun 07, 2024
Via arXiv

Recurrent Neural Networks: Vanishing and Exploding Gradients Are Not the End of the Story

May 31, 2024
Via arXiv

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks

May 24, 2024
Via arXiv

On the low-shot transferability of [V]-Mamba

Mar 15, 2024
Via arXiv

Theoretical Foundations of Deep Selective State-Space Models

Mar 04, 2024
Via arXiv

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Feb 27, 2024
Via arXiv