Picture for Zehan Wang

Zehan Wang

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

Add code
Dec 16, 2025
Viaarxiv icon

DSI-Bench: A Benchmark for Dynamic Spatial Intelligence

Add code
Oct 21, 2025
Viaarxiv icon

GenSpace: Benchmarking Spatially-Aware Image Generation

Add code
May 30, 2025
Figure 1 for GenSpace: Benchmarking Spatially-Aware Image Generation
Figure 2 for GenSpace: Benchmarking Spatially-Aware Image Generation
Figure 3 for GenSpace: Benchmarking Spatially-Aware Image Generation
Figure 4 for GenSpace: Benchmarking Spatially-Aware Image Generation
Viaarxiv icon

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

Add code
May 15, 2025
Figure 1 for T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback
Figure 2 for T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback
Figure 3 for T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback
Figure 4 for T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback
Viaarxiv icon

Depth Anything with Any Prior

Add code
May 15, 2025
Viaarxiv icon

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Add code
Apr 30, 2025
Figure 1 for RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Figure 2 for RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Figure 3 for RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Figure 4 for RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Viaarxiv icon

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision

Add code
Apr 30, 2025
Viaarxiv icon

Unleashing the Power of Natural Audio Featuring Multiple Sound Sources

Add code
Apr 24, 2025
Viaarxiv icon

EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration

Add code
Feb 20, 2025
Viaarxiv icon

OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios

Add code
Jan 02, 2025
Figure 1 for OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Figure 2 for OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Figure 3 for OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Figure 4 for OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Viaarxiv icon