Yifan Tan

AlignedKV: Reducing Memory Access of KV-Cache with Precision-Aligned Quantization

Sep 25, 2024
Via arXiv