Ion Stoica

Pie: Pooling CPU Memory for LLM Inference

Nov 14, 2024
Via arXiv

SkyServe: Serving AI Models across Regions and Clouds with Spot Instances

Nov 03, 2024
Via arXiv

NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference

Nov 02, 2024
Via arXiv

Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving

Oct 21, 2024
Via arXiv

How to Evaluate Reward Models for RLHF

Oct 18, 2024
Via arXiv

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Oct 16, 2024
Via arXiv

Efficient LLM Scheduling by Learning to Rank

Aug 28, 2024
Via arXiv

Post-Training Sparse Attention with Double Sparsity

Aug 11, 2024
Via arXiv

MPC-Minimized Secure LLM Inference

Aug 07, 2024
Via arXiv

Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design

Jul 23, 2024
Via arXiv