Dante De Nigris

Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control

Oct 09, 2024

Shimon Vainer, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Slava Elizarov, Simon Donné

Abstract: Multi-view consistency remains a challenge for image diffusion models. Even within the Text-to-Texture problem, where perfect geometric correspondences are known a priori, many methods fail to yield aligned predictions across views, necessitating non-trivial fusion methods to incorporate the results onto the original mesh. We explore this issue for a Collaborative Control workflow specifically in PBR Text-to-Texture. Collaborative Control directly models PBR image probability distributions, including normal bump maps; to our knowledge, it is the only diffusion model to directly output full PBR stacks. We discuss the design decisions involved in making this model multi-view consistent, and demonstrate the effectiveness of our approach in ablation studies, as well as practical applications.

* 19 pages, 13 figures

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

Aug 06, 2024

Ciara Rowles, Shimon Vainer, Dante De Nigris, Slava Elizarov, Konstantin Kutsy, Simon Donné

Abstract: Diffusion models continuously push the boundary of state-of-the-art image generation, but the process is hard to control with any nuance: practice proves that textual prompts are inadequate for accurately describing image style or fine structural details (such as faces). ControlNet and IPAdapter address this shortcoming by conditioning the generative process on imagery instead, but each individual instance is limited to modeling a single conditional posterior: for practical use-cases, where multiple different posteriors are desired within the same workflow, training and using multiple adapters is cumbersome. We propose IPAdapter-Instruct, which combines natural-image conditioning with "Instruct" prompts to swap between interpretations for the same conditioning image: style transfer, object extraction, both, or something else still? IPAdapter-Instruct efficiently learns multiple tasks with minimal loss in quality compared to dedicated per-task models.

* 17 pages, 10 figures, Project page: https://unity-research.github.io/IP-Adapter-Instruct.github.io/

Collaborative Control for Geometry-Conditioned PBR Image Generation

Feb 20, 2024

Shimon Vainer, Mark Boss, Mathias Parger, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Nicolas Perony, Simon Donné

Abstract: Current 3D content generation approaches build on diffusion models that output RGB images. Modern graphics pipelines, however, require physically-based rendering (PBR) material properties. We propose to model the PBR image distribution directly, avoiding photometric inaccuracies in RGB generation and the inherent ambiguity in extracting PBR from RGB. Existing paradigms for cross-modal fine-tuning are not suited for PBR generation due to both a lack of data and the high dimensionality of the output modalities: we overcome both challenges by retaining a frozen RGB model and tightly linking a newly trained PBR model using a novel cross-network communication paradigm. As the base RGB model is fully frozen, the proposed method does not risk catastrophic forgetting during fine-tuning and remains compatible with techniques such as IPAdapter pretrained for the base RGB model. In an extensive experimental section, we validate our design choices and robustness to data sparsity, and compare against existing paradigms.

* 19 pages, 10 figures; Project page: https://unity-research.github.io/holo-gen/
