Sho Yaida

Effective Theory of Transformers at Initialization

Apr 04, 2023
Via arXiv

Meta-Principled Family of Hyperparameter Scaling Strategies

Oct 10, 2022
Via arXiv

The Principles of Deep Learning Theory

Jun 18, 2021
Via arXiv

Non-Gaussian processes and neural networks at finite widths

Sep 30, 2019
Via arXiv

Robust Learning with Jacobian Regularization

Aug 07, 2019
Via arXiv

Fluctuation-dissipation relations for stochastic gradient descent

Sep 28, 2018
Via arXiv