Title: 3D-GRES: Generalized 3D Referring Expression Segmentation

Jul 31, 2024

Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji

Abstract: 3D Referring Expression Segmentation (3D-RES) is dedicated to segmenting a specific instance within a 3D space based on a natural language description. However, current approaches are limited to segmenting a single target, restricting the versatility of the task. To overcome this limitation, we introduce Generalized 3D Referring Expression Segmentation (3D-GRES), which extends the capability to segment any number of instances based on natural language instructions. In addressing this broader task, we propose the Multi-Query Decoupled Interaction Network (MDIN), designed to break down multi-object segmentation tasks into simpler, individual segmentations. MDIN comprises two fundamental components: Text-driven Sparse Queries (TSQ) and Multi-object Decoupling Optimization (MDO). TSQ generates sparse point cloud features distributed over key targets as the initialization for queries. Meanwhile, MDO is tasked with assigning each target in multi-object scenarios to different queries while maintaining their semantic consistency. To adapt to this new task, we build a new dataset, namely Multi3DRes. Our comprehensive evaluations on this dataset demonstrate substantial enhancements over existing models, thus charting a new path for intricate multi-object 3D scene comprehension. The benchmark and code are available at https://github.com/sosppxo/MDIN.
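
To make the idea of breaking a multi-object segmentation into individual segmentations concrete, here is a minimal sketch of how per-query instance masks might be combined into one output mask covering any number of targets, including none. It is an illustration only, not the MDIN implementation: the function name `merge_query_masks`, the per-query confidence scores, the 0.5 threshold, and the union rule are all assumptions that do not come from the abstract.

```python
from typing import List

Mask = List[bool]  # one flag per point in the scene's point cloud


def merge_query_masks(
    query_masks: List[Mask],
    query_scores: List[float],
    num_points: int,
    score_threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Mask:
    """Union the masks of all queries whose confidence passes the threshold.

    Each query is assumed to segment at most one instance, so the union
    can cover zero, one, or many referred objects.
    """
    merged = [False] * num_points
    for mask, score in zip(query_masks, query_scores):
        if score < score_threshold:
            continue  # assume this query did not lock onto a referred instance
        for i, inside in enumerate(mask):
            merged[i] = merged[i] or inside
    return merged


# Toy scene with 6 points and 3 queries (all values are made up).
masks = [
    [True, True, False, False, False, False],   # query 0: first object
    [False, False, False, True, True, False],   # query 1: second object
    [False, False, True, False, False, False],  # query 2: low-confidence guess
]
scores = [0.9, 0.8, 0.1]
result = merge_query_masks(masks, scores, num_points=6)
```

With these toy values, the two confident queries contribute their points and the low-confidence query is ignored. If every score falls below the threshold, the merged mask is empty, which corresponds to a description that refers to no object in the scene.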

* Accepted by ACM MM 2024 (Oral), Code: https://github.com/sosppxo/MDIN
