Ruihao Gong

HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration

Oct 02, 2024 (via arXiv)

A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms

Sep 25, 2024 (via arXiv)

OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance

Jul 30, 2024 (via arXiv)

Temporal Feature Matters: A Framework for Diffusion Model Quantization

Jul 28, 2024 (via arXiv)

ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models

Jun 13, 2024 (via arXiv)

Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection

May 10, 2024 (via arXiv)

LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models

May 09, 2024 (via arXiv)

Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes

May 09, 2024 (via arXiv)

2023 Low-Power Computer Vision Challenge (LPCVC) Summary

Mar 11, 2024 (via arXiv)

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

Feb 21, 2024 (via arXiv)