Shengyu Liu

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Via arXiv

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

Apr 13, 2023
Via arXiv