Shengyu Liu

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Via arXiv

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

Apr 13, 2023
Via arXiv