Mingfei Han

Beyond Dense Futures: World Models as Structured Planners for Robotic Manipulation

Mar 13, 2026, via arXiv

GeoSense: Internalizing Geometric Necessity Perception for Multimodal Reasoning

Mar 11, 2026, via arXiv

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation

Mar 10, 2026, via arXiv

Implicit Geometry Representations for Vision-and-Language Navigation from Web Videos

Mar 10, 2026, via arXiv

Order from Chaos: Physical World Understanding from Glitchy Gameplay Videos

Jan 23, 2026, via arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026, via arXiv

CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal

Dec 22, 2025, via arXiv

BLAZER: Bootstrapping LLM-based Manipulation Agents with Zero-Shot Data Generation

Oct 09, 2025, via arXiv

PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly

Jun 10, 2025, via arXiv

CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation

May 22, 2025, via arXiv