Title: ObPose: Leveraging Canonical Pose for Object-Centric Scene Inference in 3D

Jun 07, 2022

Yizhe Wu, Oiwi Parker Jones, Ingmar Posner

Abstract: We present ObPose, an unsupervised object-centric generative model that learns to segment 3D objects from RGB-D video. Inspired by prior art in 2D representation learning, ObPose considers a factorised latent space, separately encoding object-wise location (where) and appearance (what) information. In particular, ObPose leverages an object's canonical pose, defined via a minimum volume principle, as a novel inductive bias for learning the where component. To achieve this, we propose an efficient, voxelised approximation approach to recover the object shape directly from a neural radiance field (NeRF). As a consequence, ObPose models scenes as compositions of NeRFs representing individual objects. When evaluated on the YCB dataset for unsupervised scene segmentation, ObPose outperforms the current state-of-the-art in 3D scene inference (ObSuRF) by a significant margin in terms of segmentation quality, for both video inputs and multi-view static scenes. In addition, the design choices made in the ObPose encoder are validated with relevant ablations.
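To make the minimum volume principle concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not describe the actual procedure, so the brute-force yaw search, the restriction to rotations about the z axis, and the helper names `bbox_volume`, `rotate_z` and `min_volume_yaw` are all assumptions made for illustration. The idea shown is only that, given points standing in for occupied voxel centres of one object, a canonical orientation can be picked as the one whose axis-aligned bounding box has the smallest volume.

```python
import math

def bbox_volume(points):
    """Volume of the axis-aligned bounding box of a list of (x, y, z) points."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))

def rotate_z(points, angle):
    """Rotate points about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def min_volume_yaw(points, steps=90):
    """Hypothetical helper: brute-force search over yaw in [0, pi/2).

    The box volume repeats every 90 degrees of yaw (x and y extents swap),
    so a quarter turn covers every distinct candidate.
    """
    candidates = [i * (math.pi / 2) / steps for i in range(steps)]
    return min(candidates, key=lambda a: bbox_volume(rotate_z(points, a)))

# Toy stand-in for occupied voxel centres: the corners of a 4 x 1 x 1 box,
# rotated by 30 degrees about z so it is no longer axis-aligned.
box = [(x, y, z) for x in (0.0, 4.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
tilted = rotate_z(box, math.radians(30))

yaw = min_volume_yaw(tilted)        # rotation that re-aligns the box
aligned = rotate_z(tilted, yaw)     # points expressed in the canonical frame
```

In this toy case the search recovers a frame in which the bounding box fits the object tightly, which is the sense in which a minimum-volume pose can act as a canonical one. A full 3D version would search over all three rotation angles and would operate on shape estimates recovered from the NeRF rather than on hand-written corner points.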

* 16 pages, 6 figures
