Zihao Chen

for the ALFA study

Disentangling Granularity: An Implicit Inductive Bias in Factorized VAEs

May 30, 2025

Towards Video to Piano Music Generation with Chain-of-Perform Support Benchmarks

May 26, 2025

MM-MovieDubber: Towards Multi-Modal Learning for Multi-Modal Movie Dubbing

May 22, 2025

IIKL: Isometric Immersion Kernel Learning with Riemannian Manifold for Geometric Preservation

May 07, 2025

Towards Film-Making Production Dialogue, Narration, Monologue Adaptive Moving Dubbing Benchmarks

Apr 30, 2025

DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance

Mar 31, 2025

DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation

Mar 28, 2025

Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization

Mar 28, 2025

DeepSound-V1: Start to Think Step-by-Step in the Audio Generation from Videos

Mar 28, 2025

ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset

Mar 24, 2025