Saurav Muralidharan

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

Oct 28, 2024

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Sep 26, 2024

LLM Pruning and Distillation in Practice: The Minitron Approach

Aug 21, 2024

Compact Language Models via Pruning and Knowledge Distillation

Jul 19, 2024

Flextron: Many-in-One Flexible Large Language Model

Jun 11, 2024

The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

Sep 30, 2023

Understanding the Effect of the Long Tail on Neural Network Compression

Jun 27, 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

May 22, 2023

Efficient Sparsely Activated Transformers

Aug 31, 2022

Reliable Model Compression via Label-Preservation-Aware Loss Functions

Dec 03, 2020