Cheng Yu

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Jan 21, 2026

Unified Thinker: A General Reasoning Modular Core for Image Generation

Jan 06, 2026

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025

Understanding Diffusion Models via Code Execution

Dec 08, 2025

Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers

Oct 06, 2025

Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition

Mar 10, 2025

Accelerating Vision Diffusion Transformers with Skip Branches

Nov 26, 2024

FaceChain-FACT: Face Adapter with Decoupled Training for Identity-preserved Personalization

Oct 16, 2024

Dynamic Depth Decoding: Faster Speculative Decoding for LLMs

Aug 30, 2024

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

Jun 17, 2024