Jinyang Guo

LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

Oct 28, 2024

HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration

Oct 02, 2024

A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms

Sep 25, 2024

DDK: Distilling Domain Knowledge for Efficient Large Language Models

Jul 23, 2024

QVD: Post-training Quantization for Video Diffusion Models

Jul 16, 2024

Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes

May 09, 2024

PTQ4SAM: Post-Training Quantization for Segment Anything

May 06, 2024

BinaryDM: Towards Accurate Binarization of Diffusion Model

Apr 08, 2024

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

Feb 19, 2024

RobustMQ: Benchmarking Robustness of Quantized Models

Aug 04, 2023