Junxian Guo

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Nov 07, 2024
Via arXiv

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Oct 14, 2024
Via arXiv