Shankar Krishnan

On the Inductive Bias of Stacking Towards Improving Reasoning

Sep 27, 2024

Benchmarking Neural Network Training Algorithms

Jun 12, 2023

Adaptive Gradient Methods at the Edge of Stability

Jul 29, 2022

A Unifying View on Implicit Bias in Training Linear Neural Networks

Oct 06, 2020

Explaining Memorization and Generalization: A Large-Scale Study with Coherent Gradients

Mar 16, 2020

Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks

Nov 21, 2019

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density

Jan 29, 2019

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks

Dec 08, 2017

Achieving Approximate Soft Clustering in Data Streams

Jul 26, 2012