
Yaoyu Zhang

Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers

Jan 15, 2025

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization

Jun 26, 2024

Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks

May 26, 2024

A rationale from frequency perspective for grokking in training neural network

May 24, 2024

Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation

May 24, 2024

Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion

May 22, 2024

Disentangle Sample Size and Initialization Effect on Perfect Generalization for Single-Neuron Target

May 22, 2024

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

May 08, 2024

Structure and Gradient Dynamics Near Global Minima of Two-layer Neural Networks

Sep 01, 2023

Optimistic Estimate Uncovers the Potential of Nonlinear Models

Jul 18, 2023