Slava Elizarov

Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control

Oct 09, 2024

Shimon Vainer, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Slava Elizarov, Simon Donné

Abstract: Multi-view consistency remains a challenge for image diffusion models. Even within the Text-to-Texture problem, where perfect geometric correspondences are known a priori, many methods fail to yield aligned predictions across views, necessitating non-trivial fusion methods to incorporate the results onto the original mesh. We explore this issue for a Collaborative Control workflow specifically in PBR Text-to-Texture. Collaborative Control directly models PBR image probability distributions, including normal bump maps; to our knowledge, it is the only diffusion model to directly output full PBR stacks. We discuss the design decisions involved in making this model multi-view consistent, and demonstrate the effectiveness of our approach in ablation studies, as well as practical applications.

* 19 pages, 13 figures

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Sep 05, 2024

Slava Elizarov, Ciara Rowles, Simon Donné

Abstract: Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and complex 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data (allowing us to use only high-quality training data) and retains compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically meaningful, separate parts and include internal structures, enhancing both usability and versatility.

* 11 pages, 9 figures, Project page: https://unity-research.github.io/Geometry-Image-Diffusion.github.io/

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

Aug 06, 2024

Ciara Rowles, Shimon Vainer, Dante De Nigris, Slava Elizarov, Konstantin Kutsy, Simon Donné

Abstract: Diffusion models continuously push the boundary of state-of-the-art image generation, but the process is hard to control with any nuance: practice proves that textual prompts are inadequate for accurately describing image style or fine structural details (such as faces). ControlNet and IPAdapter address this shortcoming by conditioning the generative process on imagery instead, but each individual instance is limited to modeling a single conditional posterior: for practical use-cases, where multiple different posteriors are desired within the same workflow, training and using multiple adapters is cumbersome. We propose IPAdapter-Instruct, which combines natural-image conditioning with "Instruct" prompts to swap between interpretations for the same conditioning image: style transfer, object extraction, both, or something else still? IPAdapter-Instruct efficiently learns multiple tasks with minimal loss in quality compared to dedicated per-task models.

* 17 pages, 10 figures, Project page: https://unity-research.github.io/IP-Adapter-Instruct.github.io/
