Title: OpenScene: 3D Scene Understanding with Open Vocabularies

Nov 28, 2022

Songyou Peng, Kyle Genova, Chiyu "Max" Jiang, Andrea Tagliasacchi, Marc Pollefeys, Thomas Funkhouser

Abstract: Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are co-embedded with text and image pixels in CLIP feature space. This zero-shot approach enables task-agnostic training and open-vocabulary queries. For example, to perform SOTA zero-shot 3D semantic segmentation, it first infers CLIP features for every 3D point and later classifies them based on similarities to embeddings of arbitrary class labels. More interestingly, it enables a suite of open-vocabulary scene understanding applications that have never been done before. For example, it allows a user to enter an arbitrary text query and then see a heat map indicating which parts of a scene match. Our approach is effective at identifying objects, materials, affordances, activities, and room types in complex 3D scenes, all using a single model trained without any labeled 3D data.
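
The two query modes described in the abstract (classifying every point by similarity to label embeddings, and scoring every point against one free-text query) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 3-dimensional vectors, the use of cosine similarity, and the min-max rescaling of the heat map are all assumptions made for illustration. In a real pipeline the per-point features would come from the trained 3D model and the text embeddings from a CLIP text encoder, and an array library would replace the plain lists used here.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine_scores(point_features, text_embedding):
    """Cosine similarity between every point feature and one text embedding."""
    t = normalize(text_embedding)
    return [sum(a * b for a, b in zip(normalize(p), t)) for p in point_features]

def classify_points(point_features, label_embeddings):
    """Assign each point the label whose embedding is most similar to it."""
    labels = list(label_embeddings)
    per_label = [cosine_scores(point_features, label_embeddings[l]) for l in labels]
    result = []
    for i in range(len(point_features)):
        best = max(range(len(labels)), key=lambda j: per_label[j][i])
        result.append(labels[best])
    return result

def heat_map(point_features, query_embedding):
    """Rescale similarities to [0, 1] for display (min-max scaling is an assumption)."""
    scores = cosine_scores(point_features, query_embedding)
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

# Hypothetical toy data: 3-dimensional stand-ins for per-point and text embeddings.
point_features = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 0.8]]
label_embeddings = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "floor": [0.0, 0.0, 1.0],
}

predicted = classify_points(point_features, label_embeddings)
heat = heat_map(point_features, label_embeddings["chair"])
```

Because the label set is just a dictionary of text embeddings, changing the vocabulary in this sketch needs no retraining, which is the property the abstract calls open-vocabulary querying.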

* Project page: https://pengsongyou.github.io/openscene
