Kuan Heng Lin

Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

Jun 11, 2024

Kuan Heng Lin, Sicheng Mo, Ben Klingher, Fangzhou Mu, Bolei Zhou

Abstract: Recent controllable generation approaches such as FreeControl and Diffusion Self-guidance bring fine-grained spatial and appearance control to text-to-image (T2I) diffusion models without training auxiliary modules. However, these methods optimize the latent embedding for each type of score function with longer diffusion steps, making the generation process time-consuming and limiting their flexibility and use. This work presents Ctrl-X, a simple framework for T2I diffusion controlling structure and appearance without additional training or guidance. Ctrl-X designs feed-forward structure control to enable the structure alignment with a structure image and semantic-aware appearance transfer to facilitate the appearance transfer from a user-input image. Extensive qualitative and quantitative experiments illustrate the superior performance of Ctrl-X on various condition inputs and model checkpoints. In particular, Ctrl-X supports novel structure and appearance control with arbitrary condition images of any modality, exhibits superior image quality and appearance transfer compared to existing works, and provides instant plug-and-play functionality to any T2I and text-to-video (T2V) diffusion model. See our project page for an overview of the results: https://genforce.github.io/ctrl-x

* 18 pages, 11 figures, see project page at https://genforce.github.io/ctrl-x

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Dec 12, 2023

Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou

Abstract: Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models. However, auxiliary modules have to be trained for each type of spatial condition, model architecture, and checkpoint, putting them at odds with the diverse intents and preferences a human designer would like to convey to the AI models during the content creation process. In this work, we present FreeControl, a training-free approach for controllable T2I generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl designs structure guidance to facilitate the structure alignment with a guidance image, and appearance guidance to enable the appearance sharing between images generated using the same seed. Extensive qualitative and quantitative experiments demonstrate the superior performance of FreeControl across a variety of pre-trained T2I models. In particular, FreeControl facilitates convenient training-free control over many different architectures and checkpoints, allows the challenging input conditions on which most of the existing training-free methods fail, and achieves competitive synthesis quality with training-based approaches.

* Project Page: https://genforce.github.io/freecontrol/
