Zhou Zhao

TMD-Bench: A Multi-Level Evaluation Paradigm for Music-Dance Co-Generation

May 03, 2026
Via arXiv

Diffusion Model as a Generalist Segmentation Learner

Apr 27, 2026
Via arXiv

Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search

Apr 25, 2026
Via arXiv

Dual-Axis Generative Reward Model Toward Semantic and Turn-taking Robustness in Interactive Spoken Dialogue Models

Apr 16, 2026
Via arXiv

WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

Apr 16, 2026
Via arXiv

Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning

Apr 15, 2026
Via arXiv

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Apr 12, 2026
Via arXiv

From Perception to Planning: Evolving Ego-Centric Task-Oriented Spatiotemporal Reasoning via Curriculum Learning

Apr 12, 2026
Via arXiv

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks

Apr 09, 2026
Via arXiv

Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM

Mar 29, 2026
Via arXiv