Junghwan Seo

InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management

Jun 28, 2024
Via arXiv