Sarang Kim

QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference

Feb 15, 2024
Via arXiv