Zhenglun Kong

Fully Open Source Moxin-7B Technical Report
Dec 08, 2024

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
Nov 02, 2024

Pruning Foundation Models for High Accuracy without Retraining
Oct 21, 2024

Rethinking Token Reduction for State Space Models
Oct 16, 2024

Exploring Token Pruning in Vision State Space Models
Sep 27, 2024

Search for Efficient Large Language Models
Sep 25, 2024

Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers
Jul 25, 2024

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion
Mar 16, 2024

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge
Feb 16, 2024

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
Dec 09, 2023