Dmitry Yarotsky

SGD with memory: fundamental properties and stochastic acceleration

Oct 05, 2024

Generalization error of spectral algorithms

Mar 18, 2024

Learning high-dimensional targets by two-parameter models and gradient flow

Feb 26, 2024

Structure of universal formulas

Nov 07, 2023

A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta

Jun 22, 2022

Embedded Ensembles: Infinite Width Limit and Operating Regimes

Feb 24, 2022

Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions

Feb 02, 2022

Universal scaling laws in the gradient descent training of neural networks

May 02, 2021

Elementary superexpressive activations

Feb 22, 2021

Low-loss connection of weight vectors: distribution-based approaches

Aug 03, 2020