Yunchu Bai

TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection

Nov 05, 2024
Via arXiv