Shaorong Sun

COMOGen: A Controllable Text-to-3D Multi-object Generation Framework

Sep 01, 2024

Shaorong Sun, Shuchao Pang, Yazhou Yao, Xiaoshui Huang

Abstract: Text-to-3D object generation methods are controlled through their input text. Existing methods primarily focus on generating a single object from a single object description, and they often fail to place objects at the desired positions when the input text involves multiple objects. To address controllability in multi-object generation, this paper introduces COMOGen, a COntrollable text-to-3D Multi-Object Generation framework. COMOGen generates multiple 3D objects simultaneously by distilling layout and multi-view prior knowledge. The framework consists of three modules: the layout control module, the multi-view consistency control module, and the 3D content enhancement module. To integrate these three modules into a single framework, we propose Layout Multi-view Score Distillation, which unifies the two kinds of prior knowledge and further enhances the diversity and quality of the generated 3D content. Comprehensive experiments demonstrate the effectiveness of our approach compared with state-of-the-art methods, a significant step toward more controlled and versatile text-based 3D content generation.