Raphaël Berthier

PSL, SIERRA

Attention layers provably solve single-location regression

Oct 02, 2024

On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions

Jun 10, 2024

Leveraging the two timescale regime to demonstrate convergence of neural networks

Apr 19, 2023

Learning time-scales in two-layers neural networks

Mar 16, 2023

Incremental Learning in Diagonal Linear Networks

Aug 31, 2022

Graph-based Approximate Message Passing Iterations

Sep 24, 2021

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip

Jun 10, 2021

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Jun 15, 2020

Gossip of Statistical Observations using Orthogonal Polynomials

May 22, 2018