Zhou Zhao

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches

Jan 20, 2026
Via arXiv

Distribution-Centric Policy Optimization Dominates Exploration-Exploitation Trade-off

Jan 19, 2026
Via arXiv

Unified Thinker: A General Reasoning Modular Core for Image Generation

Jan 06, 2026
Via arXiv

Generative Reasoning Recommendation via LLMs

Oct 23, 2025
Via arXiv

DSI-Bench: A Benchmark for Dynamic Spatial Intelligence

Oct 21, 2025
Via arXiv

SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer

Sep 04, 2025
Via arXiv

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

Aug 06, 2025
Via arXiv

EC-Diff: Fast and High-Quality Edge-Cloud Collaborative Inference for Diffusion Models

Jul 16, 2025
Via arXiv

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Jul 09, 2025
Via arXiv

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Jun 26, 2025
Via arXiv