Yuxin Lai

KunServe: Elastic and Efficient Large Language Model Serving with Parameter-centric Memory Management

Dec 24, 2024
Via arXiv