Jiangmiao Pang

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

Dec 19, 2024

Learning Humanoid Locomotion with Perceptive Internal Model

Nov 21, 2024

VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

Oct 17, 2024

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness

Sep 26, 2024

GRUtopia: Dream General Robots in a City at Scale

Jul 15, 2024

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation

Jul 12, 2024

CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics

Jun 20, 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

Jun 13, 2024

Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights

May 31, 2024

Grounded 3D-LLM with Referent Tokens

May 16, 2024