Zhou Zhao

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Jun 26, 2025
Via arXiv

IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models

May 30, 2025
Via arXiv

GenSpace: Benchmarking Spatially-Aware Image Generation

May 30, 2025
Via arXiv

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis

May 20, 2025
Via arXiv

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

May 15, 2025
Via arXiv

Depth Anything with Any Prior

May 15, 2025
Via arXiv

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

May 14, 2025
Via arXiv

Rejoining fragmented ancient bamboo slips with physics-driven deep learning

May 13, 2025
Via arXiv

LiftFeat: 3D Geometry-Aware Local Feature Matching

May 06, 2025
Via arXiv

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Apr 30, 2025
Via arXiv