Yan Huang

Improving Vision-Language-Action Model Fine-Tuning with Structured Stage and Keyframe Supervision

Jun 25, 2026
Via arXiv

E-TTS: A New Embodied Test-Time Scaling Framework for Robotic Manipulation

Jun 25, 2026
Via arXiv

Decentralized Pose Graph Riemannian Optimization for Object-based Multi-Robot SLAM

Jun 23, 2026
Via arXiv

Efficient-WAM: A 1B-Parameter World-Action Model with Low-Cost Future Imagination

Jun 08, 2026
Via arXiv

WAM-Nav: Asymmetric Latent World-Action Modeling for Unified Visual Navigation

Jun 03, 2026
Via arXiv

When Seeing Is Not Believing -- A Benchmark for Search-Grounded Video Misinformation Detection

Jun 02, 2026
Via arXiv

SKIP: Sparse Keyframe Interpolation Paradigm for Efficient Embodied World Models

May 30, 2026
Via arXiv

PanopticQuery: Unified Query-Time Reasoning for 4D Scenes

Apr 07, 2026
Via arXiv

Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model

Apr 03, 2026
Via arXiv

FloorPlan-VLN: A New Paradigm for Floor Plan Guided Vision-Language Navigation

Mar 18, 2026
Via arXiv