Zhuohan Li

OpenAI o1 System Card

Dec 21, 2024

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024

Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning

May 11, 2024

Fairness in Serving Large Language Models

Dec 31, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Sep 30, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention

Sep 12, 2023

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Jun 09, 2023

What is the State of Memory Saving for Model Training?

Mar 26, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU

Mar 13, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Feb 22, 2023