Diego Granziol

Compute-Optimal LLMs Provably Generalize Better With Scale

Apr 21, 2025
Via arXiv

Universal characteristics of deep neural network loss surfaces from random matrix theory

May 17, 2022
Via arXiv

Applicability of Random Matrix Theory in Deep Learning

Feb 12, 2021
Via arXiv

Explaining the Adaptive Generalisation Gap

Nov 15, 2020
Via arXiv

Curvature is Key: Sub-Sampled Loss Surfaces and the Implications for Large Batch Training

Jun 16, 2020
Via arXiv

Flatness is a False Friend

Jun 16, 2020
Via arXiv

Beyond Random Matrix Theory for Deep Networks

Jun 13, 2020
Via arXiv

Iterate Averaging Helps: An Alternative Perspective in Deep Learning

Mar 02, 2020
Via arXiv

MLRG Deep Curvature

Dec 20, 2019
Via arXiv

A Maximum Entropy approach to Massive Graph Spectra

Dec 19, 2019
Via arXiv