Zhuoming Chen

Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models

Jun 07, 2026
Via arXiv

Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents

Jun 04, 2026
Via arXiv

MonarchRT: Efficient Attention for Real-Time Video Generation

Feb 12, 2026
Via arXiv

Jackpot: Optimal Budgeted Rejection Sampling for Extreme Actor-Policy Mismatch Reinforcement Learning

Feb 05, 2026
Via arXiv

Kinetics: Rethinking Test-Time Scaling Laws

Jun 06, 2025
Via arXiv

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?

Feb 07, 2025
Via arXiv

AdaServe: SLO-Customized LLM Serving with Fine-Grained Speculative Decoding

Jan 21, 2025
Via arXiv

MagicPIG: LSH Sampling for Efficient LLM Generation

Oct 21, 2024
Via arXiv

Sirius: Contextual Sparsity with Correction for Efficient LLMs

Sep 05, 2024
Via arXiv

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Aug 21, 2024
Via arXiv