Shiwei Liu

Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning

Nov 26, 2024
Via arXiv

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models

Oct 14, 2024
Via arXiv

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Oct 10, 2024
Via arXiv

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning

Oct 09, 2024
Via arXiv

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

Jul 24, 2024
Via arXiv

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

Jul 15, 2024
Via arXiv

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Jul 11, 2024
Via arXiv

Composable Interventions for Language Models

Jul 09, 2024
Via arXiv

Dynamic Data Pruning for Automatic Speech Recognition

Jun 26, 2024
Via arXiv

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Jun 25, 2024
Via arXiv