Yefan Zhou

Model Balancing Helps Low-data Training and Fine-tuning

Oct 16, 2024

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

Oct 14, 2024

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models

Oct 14, 2024

Sharpness-Diversity Tradeoff: Improving Flat Ensembles with SharpBalance

Jul 17, 2024

MD Tree: A Model-Diagnostic Tree Grown on Loss Landscape

Jun 24, 2024

Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

Dec 01, 2023

A Three-regime Model of Network Pruning

May 28, 2023

A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks

Nov 30, 2021