Picture for Yujun Lin

Yujun Lin

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Add code
Feb 03, 2026
Viaarxiv icon

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Add code
Jan 20, 2026
Viaarxiv icon

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Add code
Jun 24, 2025
Viaarxiv icon

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Add code
May 24, 2025
Viaarxiv icon

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Add code
Feb 20, 2025
Viaarxiv icon

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Add code
Feb 03, 2025
Figure 1 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 2 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 3 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 4 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Viaarxiv icon

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

Add code
Jan 30, 2025
Figure 1 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 2 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 3 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 4 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Viaarxiv icon

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Add code
Nov 07, 2024
Figure 1 for SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Figure 2 for SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Figure 3 for SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Figure 4 for SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Viaarxiv icon

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Add code
Oct 15, 2024
Figure 1 for SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Figure 2 for SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Figure 3 for SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Figure 4 for SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Viaarxiv icon

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Add code
May 07, 2024
Figure 1 for QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Figure 2 for QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Figure 3 for QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Figure 4 for QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Viaarxiv icon