Minyi Guo

GF-DiT: Scheduling Parallelism for Diffusion Transformer Serving

Jun 11, 2026
Via arXiv

polyDAG: Polynomial Acyclicity Constraints for Efficient Continuous Causal Discovery in Visual Semantic Graphs

Jun 05, 2026
Via arXiv

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference

May 06, 2026
Via arXiv

CuBridge: An LLM-Based Framework for Understanding and Reconstructing High-Performance Attention Kernels

May 06, 2026
Via arXiv

CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems

Mar 30, 2026
Via arXiv

DASH: Deterministic Attention Scheduling for High-throughput Reproducible LLM Training

Jan 29, 2026
Via arXiv

Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding

Dec 29, 2025
Via arXiv

Harli: SLO-Aware Co-location of LLM Inference and PEFT-based Finetuning on Model-as-a-Service Platforms

Nov 19, 2025
Via arXiv

MoE-SpeQ: Speculative Quantized Decoding with Proactive Expert Prefetching and Offloading for Mixture-of-Experts

Nov 18, 2025
Via arXiv

Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution

Sep 11, 2025
Via arXiv