Title: OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding

Aug 20, 2024

Youjun Zhao, Jiaying Lin, Shuquan Ye, Qianshi Pang, Rynson W. H. Lau

Abstract: Open-vocabulary 3D scene understanding (OV-3D) aims to localize and classify novel objects beyond a closed set of object classes. However, existing approaches and benchmarks primarily focus on the open-vocabulary problem within the context of object classes, which is insufficient for a holistic evaluation of the extent to which a model understands a 3D scene. In this paper, we introduce a more challenging task called Generalized Open-Vocabulary 3D Scene Understanding (GOV-3D) to explore the open-vocabulary problem beyond object classes. It encompasses an open and diverse set of generalized knowledge, expressed as linguistic queries about fine-grained and object-specific attributes. To this end, we contribute a new benchmark named OpenScan, which consists of 3D object attributes across eight representative linguistic aspects, including affordance, property, material, and more. We further evaluate state-of-the-art OV-3D methods on our OpenScan benchmark and find that these methods struggle to comprehend the abstract vocabularies of the GOV-3D task, a challenge that cannot be addressed by simply scaling up object classes during training. We highlight the limitations of existing methodologies and explore a promising direction to overcome the identified shortcomings. Data and code are available at https://github.com/YoujunZhao/OpenScan
