Eshaan Nichani

How Transformers Learn Causal Structure with Gradient Descent

Feb 22, 2024

Learning Hierarchical Polynomials with Three-Layer Neural Networks

Nov 23, 2023

Fine-Tuning Language Models with Just Forward Passes

May 27, 2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

May 18, 2023

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks

May 11, 2023

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability

Sep 30, 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials

Jun 08, 2022

Do Deeper Convolutional Networks Perform Better?

Oct 19, 2020

Balancedness and Alignment are Unlikely in Linear Neural Networks

Mar 13, 2020