Ion Stoica

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Dec 11, 2024

GameArena: Evaluating LLM Reasoning through Live Computer Games

Dec 09, 2024

FogROS2-FT: Fault Tolerant Cloud Robotics

Dec 06, 2024

BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching

Nov 25, 2024

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs

Nov 18, 2024

Pie: Pooling CPU Memory for LLM Inference

Nov 14, 2024

SkyServe: Serving AI Models across Regions and Clouds with Spot Instances

Nov 03, 2024

NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference

Nov 02, 2024

Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving

Oct 21, 2024

How to Evaluate Reward Models for RLHF

Oct 18, 2024