In robot manipulation tasks, and in in-hand manipulation in particular, estimating the position and orientation of an object is an essential skill for manipulating objects freely. However, because in-hand manipulation tends to cause occlusion by the hand itself, image information alone is insufficient. One approach to this challenge is to combine tactile sensing. The advantage of using multiple sensors (modalities) is that the other modalities can compensate for occlusion, noise, and sensor malfunctions. Although judging the reliability of each modality according to the situation is important, manually designing a model that handles such diverse situations is difficult. Therefore, in this study, we propose deep gated multi-modal learning, an end-to-end deep learning approach in which the network self-determines the reliability of each modality. In the experiments, an RGB camera and a GelSight tactile sensor were attached to the gripper of a Sawyer robot, and object poses were estimated during grasping. A total of 15 objects were used. The proposed model determined the reliability of each modality according to the noise and failure of each sensor, and we confirmed that poses were estimated even for unknown objects.
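To illustrate the gating idea described above, the sketch below shows one possible gated fusion layer in PyTorch: a learned sigmoid gate weights the image and tactile features so the network can down-weight a noisy or failed modality. The module name, feature dimensions, pose parameterization, and the specific gating form are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GatedMultimodalFusion(nn.Module):
    """Fuse image and tactile features with a learned per-modality gate.

    A sigmoid gate, computed from both feature vectors, weights each
    modality's contribution, letting the network suppress a noisy or
    failed sensor. All dimensions here are illustrative assumptions.
    """

    def __init__(self, img_dim=256, tac_dim=256, fused_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, fused_dim)
        self.tac_proj = nn.Linear(tac_dim, fused_dim)
        # The gate sees both modalities and outputs weights in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(img_dim + tac_dim, fused_dim),
            nn.Sigmoid(),
        )
        # Pose head: 3-D position + quaternion orientation (assumed).
        self.pose_head = nn.Linear(fused_dim, 7)

    def forward(self, img_feat, tac_feat):
        h_img = torch.tanh(self.img_proj(img_feat))
        h_tac = torch.tanh(self.tac_proj(tac_feat))
        z = self.gate(torch.cat([img_feat, tac_feat], dim=-1))
        # z acts as the image modality's reliability; (1 - z) weights touch.
        fused = z * h_img + (1.0 - z) * h_tac
        return self.pose_head(fused), z


# Usage: gate values near 0 mean the tactile modality dominates.
img = torch.randn(4, 256)
tac = torch.randn(4, 256)
pose, gate = GatedMultimodalFusion()(img, tac)
print(pose.shape, gate.shape)  # torch.Size([4, 7]) torch.Size([4, 256])
```

Training such a layer end-to-end on the pose loss would let the gate weights adapt to occlusion or sensor failure without any hand-designed reliability rules, which is the behavior the abstract attributes to the proposed model.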