Luo Mai

MoE-CAP: Cost-Accuracy-Performance Benchmarking for Mixture-of-Experts Systems

Dec 10, 2024, via arXiv

PH-Dropout: Practical Epistemic Uncertainty Quantification for View Synthesis

Oct 11, 2024, via arXiv

PH-Dropout: Practical Epistemic Uncertainty Quantification for View Synthesis

Oct 07, 2024, via arXiv

Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding

Jul 12, 2024, via arXiv

MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving

Jan 25, 2024, via arXiv

ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Jan 25, 2024, via arXiv

TENPLEX: Changing Resources of Deep Learning Jobs using Parallelizable Tensor Collections

Dec 08, 2023, via arXiv

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

Oct 08, 2023, via arXiv

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023, via arXiv

Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness

May 18, 2023, via arXiv