Zhongwang Zhang

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization

Jun 26, 2024

Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation

May 24, 2024

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

May 08, 2024

Loss Jump During Loss Switch in Solving PDEs with Neural Networks

May 06, 2024

Anchor function: a type of benchmark functions for studying language models

Jan 16, 2024

Optimistic Estimate Uncovers the Potential of Nonlinear Models

Jul 18, 2023

Stochastic Modified Equations and Dynamics of Dropout Algorithm

May 25, 2023

Loss Spike in Training Neural Networks

May 20, 2023

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

Nov 21, 2022

Implicit regularization of dropout

Jul 13, 2022