Bingyang Wu

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Via arXiv

A Survey of Resource-efficient LLM and Multimodal Foundation Models

Jan 16, 2024
Via arXiv

Fast Distributed Inference Serving for Large Language Models

May 10, 2023
Via arXiv