Yufan Deng

MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

Jan 12, 2026
Via arXiv

Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models

Jan 12, 2026
Via arXiv

MTPNet: Multi-Grained Target Perception for Unified Activity Cliff Prediction

Jun 05, 2025
Via arXiv

MAGREF: Masked Guidance for Any-Reference Video Generation

May 29, 2025
Via arXiv

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

May 28, 2025
Via arXiv

Anymate: A Dataset and Baselines for Learning 3D Object Rigging

May 09, 2025
Via arXiv

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Mar 18, 2025
Via arXiv

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Mar 13, 2025
Via arXiv

VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model

Jan 21, 2025
Via arXiv

VideoTetris: Towards Compositional Text-to-Video Generation

Jun 06, 2024
Via arXiv