Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Sep 02, 2024

Yang Zhang, Rui Zhang, Xuecheng Nie, Haochen Li, Jikun Chen, Yifan Hao, Xin Zhang, Luoqi Liu, Ling Li

Figure 1 for SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Figure 2 for SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Figure 3 for SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Figure 4 for SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Share this with someone who'll enjoy it:

Abstract:Recent text-to-image models have achieved remarkable success in generating high-quality images. However, when tasked with multi-concept generation which creates images containing multiple characters or objects, existing methods often suffer from attribute confusion, resulting in severe text-image inconsistency. We found that attribute confusion occurs when a certain region of the latent features attend to multiple or incorrect prompt tokens. In this work, we propose novel Semantic Protection Diffusion (SPDiffusion) to protect the semantics of regions from the influence of irrelevant tokens, eliminating the confusion of non-corresponding attributes. In the SPDiffusion framework, we design a Semantic Protection Mask (SP-Mask) to represent the relevance of the regions and the tokens, and propose a Semantic Protection Cross-Attention (SP-Attn) to shield the influence of irrelevant tokens on specific regions in the generation process. To evaluate our method, we created a diverse multi-concept benchmark, and SPDiffusion achieves state-of-the-art results on this benchmark, proving its effectiveness. Our method can be combined with many other application methods or backbones, such as ControlNet, Story Diffusion, PhotoMaker and PixArt-alpha to enhance their multi-concept capabilities, demonstrating strong compatibility and scalability.

View paper on

Share this with someone who'll enjoy it:

Title:SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Paper and Code