Title: 3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects

Jul 14, 2024

Weiming Zhi, Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson

Abstract: Humans have the remarkable ability to use held objects as tools to interact with their environment. For this to occur, humans internally estimate how hand movements affect the object's movement. We wish to endow robots with this capability. We contribute a methodology to jointly estimate the geometry and pose of objects grasped by a robot, from RGB images captured by an external camera. Notably, our method transforms the estimated geometry into the robot's coordinate frame without requiring the extrinsic parameters of the external camera to be calibrated. Our approach leverages 3D foundation models, large models pre-trained on huge datasets for 3D vision tasks, to produce initial estimates of the in-hand object. These initial estimates do not have physically correct scales and are expressed in the camera's frame. We then formulate, and efficiently solve, a coordinate-alignment problem to recover accurate scales, along with a transformation of the objects to the coordinate frame of the robot. Forward kinematics mappings can subsequently be defined from the manipulator's joint angles to specified points on the object. These mappings allow points on the held object to be estimated at arbitrary configurations, so that robot motion can be designed with respect to coordinates on the grasped objects. We empirically evaluate our approach on a robot manipulator holding a diverse set of real-world objects.
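
The coordinate-alignment step described in the abstract can be illustrated with a generic technique. The sketch below is not the authors' formulation, which the abstract does not spell out. It uses the closed-form Umeyama similarity alignment and assumes that a set of corresponding 3D points is already known in both the unscaled camera frame and the robot frame. The function names `similarity_align` and `demo`, the synthetic points, and the ground-truth values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def similarity_align(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.

    Generic Umeyama alignment; src and dst are (N, 3) arrays of
    corresponding points (assumed known here, which is an assumption).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0

    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


def demo():
    """Recover a known similarity transform from synthetic, noise-free points."""
    rng = np.random.default_rng(0)
    pts_cam = rng.normal(size=(20, 3))  # hypothetical unscaled camera-frame points
    angle = 0.7
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    s_true, t_true = 0.25, np.array([0.4, -0.1, 0.3])
    pts_robot = s_true * pts_cam @ R_true.T + t_true  # same points, robot frame
    s, R, t = similarity_align(pts_cam, pts_robot)
    return s, R, t, s_true, R_true, t_true
```

Once `s`, `R`, and `t` are estimated, any point `p` of the reconstructed geometry could be mapped into the robot frame as `s * R @ p + t`. With real data the correspondences are noisy, so the recovered transform would be a least-squares fit rather than an exact match.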
