Shiwei Zhang

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

May 19, 2026

DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models

May 14, 2026

Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction

May 14, 2026

M$^\star$: Every Task Deserves Its Own Memory Harness

Apr 10, 2026

AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation

Mar 31, 2026

DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning

Mar 12, 2026

ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

Dec 11, 2025

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Dec 09, 2025

Self-Contradiction as Self-Improvement: Mitigating the Generation-Understanding Gap in MLLMs

Jul 22, 2025

UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer

Apr 15, 2025