Yonghao Zhuang

Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

Jun 03, 2024

Toward Inference-optimal Mixture-of-Expert Large Language Models

Apr 03, 2024

LLM360: Towards Fully Transparent Open-Source LLMs

Dec 11, 2023

Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

Oct 25, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Sep 30, 2023

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Jun 09, 2023

On Optimizing the Communication of Model Parallelism

Nov 10, 2022

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning

Jan 28, 2022