Zili Zhang

UltraEP: Unleash MoE Training and Inference on Rack-Scale Nodes with Near-Optimal Load Balancing

Jun 02, 2026

BigMac: Breaking the Pareto Frontier of Compute and Memory in Multimodal LLM Training

May 25, 2026

Heddle: A Distributed Orchestration System for Agentic RL Rollout

Mar 30, 2026

TokenLake: A Unified Segment-level Prefix Cache Pool for Fine-grained Elastic Long-Context LLM Serving

Aug 24, 2025

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

Label-efficient Single Photon Images Classification via Active Learning

May 07, 2025

StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation

Apr 22, 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation

Apr 18, 2024

Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

Apr 02, 2024