Arina Kharlamova

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference

Mar 04, 2025
Via arXiv