Title: 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

Dec 02, 2022

Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao

Abstract: Text-guided diffusion models have shown superior performance in image/video generation and editing, yet few explorations have been performed in 3D scenarios. In this paper, we discuss three fundamental and interesting problems on this topic. First, we equip text-guided diffusion models to achieve $\textbf{3D-consistent generation}$. Specifically, we integrate a NeRF-like neural field to generate low-resolution coarse results for a given camera view. Such results provide 3D priors as condition information for the subsequent diffusion process. During denoising diffusion, we further enhance 3D consistency by modeling cross-view correspondences with a novel two-stream (corresponding to two different views) asynchronous diffusion process. Second, we study $\textbf{3D local editing}$ and propose a two-step solution that can generate 360$^{\circ}$ manipulated results by editing an object from a single view. In Step 1, we perform 2D local editing by blending the predicted noises. In Step 2, we conduct a noise-to-text inversion process that maps the 2D blended noises into the view-independent text embedding space. Once the corresponding text embedding is obtained, 360$^{\circ}$ images can be generated. Last but not least, we extend our model to perform $\textbf{one-shot novel view synthesis}$ by fine-tuning on a single image, showing for the first time the potential of leveraging text guidance for novel view synthesis. Extensive experiments and various applications show the prowess of our 3DDesigner. The project page is available at https://3ddesigner-diffusion.github.io/.
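
To make Step 1 of the editing pipeline concrete, the sketch below shows one plausible reading of "blending the predicted noises". The abstract does not state the blending rule, so the mask-weighted form used here is an assumption, and the names `blend_predicted_noise`, `eps_source`, `eps_edit`, and `mask` are hypothetical and do not come from the paper.

```python
def blend_predicted_noise(eps_source, eps_edit, mask):
    """Blend two per-pixel noise predictions with a spatial mask.

    Assumed rule (not taken from the paper):
        blended = mask * eps_edit + (1 - mask) * eps_source
    All three arguments are 2D lists of floats with the same shape;
    mask is 1.0 inside the region to edit and 0.0 elsewhere.
    """
    if not (len(eps_source) == len(eps_edit) == len(mask)):
        raise ValueError("inputs must have the same number of rows")
    blended = []
    for row_s, row_e, row_m in zip(eps_source, eps_edit, mask):
        if not (len(row_s) == len(row_e) == len(row_m)):
            raise ValueError("inputs must have the same number of columns")
        # Inside the mask take the edit-prompt noise, outside keep the source noise.
        blended.append([m * e + (1.0 - m) * s
                        for s, e, m in zip(row_s, row_e, row_m)])
    return blended


# Toy 2x2 example: edit only the left column.
eps_source = [[0.1, -0.2], [0.3, 0.4]]   # noise predicted under the original prompt
eps_edit = [[1.0, 1.0], [1.0, 1.0]]      # noise predicted under the edit prompt
mask = [[1.0, 0.0], [1.0, 0.0]]
blended = blend_predicted_noise(eps_source, eps_edit, mask)
```

In this toy case the left column of `blended` follows `eps_edit` and the right column keeps `eps_source`, which is the behaviour one would expect from a local edit that leaves the unmasked region untouched.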

* 15 pages, 12 figures, conference
