Cheng Yu

DDA-Thinker: Decoupled Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing

Apr 28, 2026

AdaSpark: Adaptive Sparsity for Efficient Long-Video Understanding

Apr 09, 2026

SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation

Mar 23, 2026

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Jan 21, 2026

Unified Thinker: A General Reasoning Modular Core for Image Generation

Jan 06, 2026

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025

Understanding Diffusion Models via Code Execution

Dec 08, 2025

Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers

Oct 06, 2025

Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition

Mar 10, 2025

Accelerating Vision Diffusion Transformers with Skip Branches

Nov 26, 2024