Shiwei Liu

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models

Oct 14, 2024
Via arXiv

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Oct 10, 2024
Via arXiv

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning

Oct 09, 2024
Via arXiv

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

Jul 24, 2024
Via arXiv

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

Jul 15, 2024
Via arXiv

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Jul 11, 2024
Via arXiv

Composable Interventions for Language Models

Jul 09, 2024
Via arXiv

Dynamic Data Pruning for Automatic Speech Recognition

Jun 26, 2024
Via arXiv

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Jun 25, 2024
Via arXiv

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion

Jun 14, 2024
Via arXiv