Zhongwang Zhang

Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers

Jan 15, 2025

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization

Jun 26, 2024

Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation

May 24, 2024

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

May 08, 2024

Loss Jump During Loss Switch in Solving PDEs with Neural Networks

May 06, 2024

Anchor function: a type of benchmark functions for studying language models

Jan 16, 2024

Optimistic Estimate Uncovers the Potential of Nonlinear Models

Jul 18, 2023

Stochastic Modified Equations and Dynamics of Dropout Algorithm

May 25, 2023

Loss Spike in Training Neural Networks

May 20, 2023

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

Nov 21, 2022