Zhiyuan Qin

BiManiBench: A Hierarchical Benchmark for Evaluating Bimanual Coordination of Multimodal Large Language Models

Feb 09, 2026
Via arXiv

TC-IDM: Grounding Video Generation for Executable Zero-shot Robot Motion

Jan 26, 2026
Via arXiv

Wow, wo, val! A Comprehensive Embodied World Model Evaluation Turing Test

Jan 07, 2026
Via arXiv

WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation

Oct 08, 2025
Via arXiv

WoW: Towards a World omniscient World model Through Embodied Interaction

Sep 26, 2025
Via arXiv

Follow-Your-Instruction: A Comprehensive MLLM Agent for World Data Synthesis

Aug 07, 2025
Via arXiv

EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks

Mar 14, 2025
Via arXiv