Junxian Guo

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Nov 07, 2024
Via arXiv

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Oct 14, 2024
Via arXiv