Zhuohan Li

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024
Via arXiv

Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning

May 11, 2024
Via arXiv

Fairness in Serving Large Language Models

Dec 31, 2023
Via arXiv

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Sep 30, 2023
Via arXiv

Efficient Memory Management for Large Language Model Serving with PagedAttention

Sep 12, 2023
Via arXiv

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Jun 09, 2023
Via arXiv

What is the State of Memory Saving for Model Training?

Mar 26, 2023
Via arXiv

High-throughput Generative Inference of Large Language Models with a Single GPU

Mar 13, 2023
Via arXiv

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Feb 22, 2023
Via arXiv

On Optimizing the Communication of Model Parallelism

Nov 10, 2022
Via arXiv