Picture for Zhou Zhao

Zhou Zhao

Dual-Axis Generative Reward Model Toward Semantic and Turn-taking Robustness in Interactive Spoken Dialogue Models

Add code
Apr 16, 2026
Viaarxiv icon

WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

Add code
Apr 16, 2026
Viaarxiv icon

Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning

Add code
Apr 15, 2026
Viaarxiv icon

From Perception to Planning: Evolving Ego-Centric Task-Oriented Spatiotemporal Reasoning via Curriculum Learning

Add code
Apr 12, 2026
Viaarxiv icon

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Add code
Apr 12, 2026
Viaarxiv icon

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks

Add code
Apr 09, 2026
Viaarxiv icon

Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM

Add code
Mar 29, 2026
Viaarxiv icon

CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration

Add code
Mar 26, 2026
Viaarxiv icon

SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation

Add code
Mar 23, 2026
Viaarxiv icon

Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness

Add code
Mar 16, 2026
Viaarxiv icon