Hao Zhao

CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations

Dec 30, 2025

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Dec 29, 2025

Native and Compact Structured Latents for 3D Generation

Dec 16, 2025

HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment

Oct 27, 2025

Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method

Oct 27, 2025

FlowDrive: Energy Flow Field for End-to-End Autonomous Driving

Sep 17, 2025

StereoCarla: A High-Fidelity Driving Dataset for Generalizable Stereo

Sep 16, 2025

RoboChemist: Long-Horizon and Safety-Compliant Robotic Chemical Experimentation

Sep 10, 2025

TA-VLA: Elucidating the Design Space of Torque-aware Vision-Language-Action Models

Sep 09, 2025

One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation

Sep 09, 2025