Heung-Yeung Shum

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jan 09, 2026

Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition

Dec 17, 2025

Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer

Aug 12, 2025

Step-Audio 2 Technical Report

Jul 24, 2025

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Mar 31, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

Taming Teacher Forcing for Masked Autoregressive Video Generation

Jan 21, 2025

Multi-matrix Factorization Attention

Dec 26, 2024