Title: Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Nov 26, 2023

Zhihao Yuan, Jinke Ren, Chun-Mei Feng, Hengshuang Zhao, Shuguang Cui, Zhen Li

Abstract: 3D Visual Grounding (3DVG) aims to localize 3D objects based on textual descriptions. Conventional supervised methods for 3DVG often require extensive annotations and a predefined vocabulary, which can be restrictive. To address this issue, we propose a novel visual programming approach for zero-shot open-vocabulary 3DVG, leveraging the capabilities of large language models (LLMs). Our approach begins with a dialog-based method that engages with LLMs to establish a foundational understanding of zero-shot 3DVG. Building on this, we design a visual program that consists of three types of modules: view-independent, view-dependent, and functional modules. These modules, specifically tailored for 3D scenarios, work collaboratively to perform complex reasoning and inference. Furthermore, we develop a language-object correlation module to extend the scope of existing 3D object detectors to open-vocabulary scenarios. Extensive experiments demonstrate that our zero-shot approach can outperform some supervised baselines, marking a significant stride towards effective 3DVG.

* Under review, project website: https://curryyuan.github.io/ZSVG3D/
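
To make the three module types named in the abstract more concrete, here is a minimal Python sketch of how such a visual program might be composed. It is an illustration under stated assumptions, not the authors' implementation: the `Box` type, the function names `loc`, `closest`, `left_of`, and `smallest`, the toy scene, and the geometric rules are all hypothetical, and the label-matching `loc` merely stands in for an open-vocabulary detector.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Box:
    """Hypothetical 3D detection: a label plus an axis-aligned box."""
    label: str
    center: tuple  # (x, y, z), z pointing up
    size: tuple    # (dx, dy, dz)


def loc(scene, label):
    # Stand-in for an object detector: exact label match only (assumption).
    return [b for b in scene if b.label == label]


def closest(targets, anchors):
    # View-independent module: the result does not depend on where the viewer stands.
    return min(targets,
               key=lambda t: min(math.dist(t.center, a.center) for a in anchors))


def left_of(targets, anchor, viewpoint):
    # View-dependent module: "left" is defined relative to a viewer looking
    # from `viewpoint` toward `anchor`, projected onto the ground plane.
    vx, vy = anchor.center[0] - viewpoint[0], anchor.center[1] - viewpoint[1]
    result = []
    for t in targets:
        tx, ty = t.center[0] - anchor.center[0], t.center[1] - anchor.center[1]
        # Positive 2D cross product: target lies counter-clockwise (left)
        # of the viewing direction.
        if vx * ty - vy * tx > 0:
            result.append(t)
    return result


def smallest(targets):
    # Functional module: a generic selection over an attribute (box volume).
    return min(targets, key=lambda t: t.size[0] * t.size[1] * t.size[2])


# Toy scene (made up for illustration).
scene = [
    Box("table", (0.0, 0.0, 0.0), (2.0, 1.0, 1.0)),
    Box("chair", (-1.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    Box("chair", (2.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    Box("chair", (-3.0, 0.0, 0.0), (0.5, 0.5, 0.5)),
]

tables = loc(scene, "table")
chairs = loc(scene, "chair")

# "The chair closest to the table."
nearest_chair = closest(chairs, tables)

# "The smallest chair to the left of the table", viewed from (0, -5).
left_chairs = left_of(chairs, tables[0], viewpoint=(0.0, -5.0))
smallest_left_chair = smallest(left_chairs)
```

In this sketch, a language model would be responsible for translating a description such as "the smallest chair to the left of the table" into the sequence of module calls at the bottom; that translation step is not shown here.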
