Fanpeng Meng

DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors

Jul 23, 2024

Zizheng Yan, Jiapeng Zhou, Fanpeng Meng, Yushuang Wu, Lingteng Qiu, Zisheng Ye, Shuguang Cui, Guanying Chen, Xiaoguang Han

Abstract: Text-to-3D generation has recently seen significant progress. To enhance its practicality in real-world applications, it is crucial to generate multiple independent objects with interactions, similar to layer-compositing in 2D image editing. However, existing text-to-3D methods struggle with this task, as they are designed to generate either non-independent objects or independent objects lacking spatially plausible interactions. Addressing this, we propose DreamDissector, a text-to-3D method capable of generating multiple independent objects with interactions. DreamDissector accepts a multi-object text-to-3D NeRF as input and produces independent textured meshes. To achieve this, we introduce the Neural Category Field (NeCF) for disentangling the input NeRF. Additionally, we present the Category Score Distillation Sampling (CSDS), facilitated by a Deep Concept Mining (DCM) module, to tackle the concept gap issue in diffusion models. By leveraging NeCF and CSDS, we can effectively derive sub-NeRFs from the original scene. Further refinement enhances geometry and texture. Our experimental results validate the effectiveness of DreamDissector, providing users with novel means to control 3D synthesis at the object level and potentially opening avenues for various creative applications in the future.

* ECCV 2024. Project page: https://chester256.github.io/dreamdissector

3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification

Dec 01, 2022

Jiazhao Zhang, Liu Dai, Fanpeng Meng, Qingnan Fan, Xuelin Chen, Kai Xu, He Wang

Abstract: Object goal navigation (ObjectNav) in unseen environments is a fundamental task for Embodied AI. Agents in existing works learn ObjectNav policies based on 2D maps, scene graphs, or image sequences. Considering that this task happens in 3D space, a 3D-aware agent can advance its ObjectNav capability by learning from fine-grained spatial information. However, leveraging 3D scene representation can be prohibitively impractical for policy learning in this floor-level task, due to low sample efficiency and expensive computational cost. In this work, we propose a framework for the challenging 3D-aware ObjectNav based on two straightforward sub-policies. The two sub-policies, namely a corner-guided exploration policy and a category-aware identification policy, operate simultaneously, using online fused 3D points as observation. Through extensive experiments, we show that this framework can dramatically improve the performance in ObjectNav through learning from 3D scene representation. Our framework achieves the best performance among all modular-based methods on the Matterport3D and Gibson datasets, while requiring up to 30x less computational cost for training.
