Jieyu Zhang

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

Jun 17, 2026
Via arXiv

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

Jun 03, 2026
Via arXiv

Demo-JEPA: Joint-Embedding Predictive Architecture for One-shot Cross-Embodiment Imitation

May 20, 2026
Via arXiv

You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass

Apr 13, 2026
Via arXiv

WildDet3D: Scaling Promptable 3D Detection in the Wild

Apr 09, 2026
Via arXiv

MolmoPoint: Better Pointing for VLMs with Grounding Tokens

Mar 30, 2026
Via arXiv

URDF-Anything+: Autoregressive Articulated 3D Models Generation for Physical Simulation

Mar 14, 2026
Via arXiv

Video-Based Reward Modeling for Computer-Use Agents

Mar 10, 2026
Via arXiv

TrajTok: Learning Trajectory Tokens enables better Video Understanding

Feb 26, 2026
Via arXiv

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

Feb 04, 2026
Via arXiv