Congjie He

WaferLLM: A Wafer-Scale LLM Inference System

Feb 06, 2025
Via arXiv

MoE-CAP: Cost-Accuracy-Performance Benchmarking for Mixture-of-Experts Systems

Dec 10, 2024
Via arXiv

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

Oct 08, 2023
Via arXiv

Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness

May 18, 2023
Via arXiv