
Wei-Lin Chiang

SkyServe: Serving AI Models across Regions and Clouds with Spot Instances

Nov 03, 2024
Via arXiv

How to Evaluate Reward Models for RLHF

Oct 18, 2024
Via arXiv

RouteLLM: Learning to Route LLMs with Preference Data

Jun 26, 2024
Via arXiv

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Jun 17, 2024
Via arXiv

OR-Bench: An Over-Refusal Benchmark for Large Language Models

May 31, 2024
Via arXiv

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024
Via arXiv

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Mar 07, 2024
Via arXiv

LLM-Assisted Code Cleaning For Training Accurate Code Generators

Nov 25, 2023
Via arXiv

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

Nov 11, 2023
Via arXiv

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Sep 30, 2023
Via arXiv