Hongru Yang

Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis

Oct 12, 2024

Theoretical Characterization of How Neural Network Pruning Affects its Generalization

Jan 05, 2023

Sharper analysis of sparsely activated wide neural networks with trainable biases

Jan 01, 2023

On the Neural Tangent Kernel Analysis of Randomly Pruned Wide Neural Networks

Apr 06, 2022