Title: Dense Text-to-Image Generation with Attention Modulation

Aug 24, 2023

Yunji Kim, Jiyoung Lee, Jin-Hwa Kim, Jung-Woo Ha, Jun-Yan Zhu

Abstract: Existing text-to-image diffusion models struggle to synthesize realistic images given dense captions, where each text prompt provides a detailed description for a specific image region. To address this, we propose DenseDiffusion, a training-free method that adapts a pre-trained text-to-image model to handle such dense captions while offering control over the scene layout. We first analyze the relationship between generated images' layouts and the pre-trained model's intermediate attention maps. Next, we develop an attention modulation method that guides objects to appear in specific regions according to layout guidance. Without requiring additional fine-tuning or datasets, we improve image generation performance given dense captions, as measured by both automatic and human evaluation scores. In addition, we achieve visual results of similar quality to those of models specifically trained with layout conditions.
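
To make the idea of layout-guided attention modulation concrete, here is a minimal NumPy sketch. It is not taken from the paper or its repository: the additive bias, the `strength` parameter, and the function name `modulated_cross_attention` are all assumptions chosen for illustration. It only shows the general principle of raising cross-attention scores between a text token and the image positions inside its assigned region, and lowering them elsewhere.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def modulated_cross_attention(q, k, region_mask, strength=2.0):
    """Illustrative (hypothetical) layout-guided cross-attention.

    q:           (n_pixels, d) image-position queries
    k:           (n_tokens, d) text-token keys
    region_mask: (n_pixels, n_tokens) boolean; True where a token's
                 region covers that image position
    strength:    assumed scalar controlling how strongly the layout
                 is enforced (0 recovers plain attention)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)

    # Tokens that are not tied to any region are left unmodified.
    assigned = region_mask.any(axis=0)

    # +strength inside a token's region, -strength outside it.
    bias = np.where(region_mask, strength, -strength) * assigned[None, :]
    return softmax(scores + bias, axis=-1)


# Toy example: a 2x2 image (4 positions) and 3 tokens.
# Token 0 describes the left column, token 1 the right column,
# token 2 (e.g. a filler word) has no region.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(3, 8))
mask = np.zeros((4, 3), dtype=bool)
mask[[0, 2], 0] = True  # left column positions
mask[[1, 3], 1] = True  # right column positions

plain = modulated_cross_attention(q, k, mask, strength=0.0)
guided = modulated_cross_attention(q, k, mask, strength=2.0)
print(plain[0], guided[0])
```

In this toy setup, the attention that position 0 pays to token 0 is always higher in `guided` than in `plain`, because the bias raises that pair's score relative to every other token. A real diffusion model would apply such a modulation inside its attention layers at each denoising step; how DenseDiffusion does this precisely is described in the paper and the linked code.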

* Accepted by ICCV 2023. Code and data are available at https://github.com/naver-ai/DenseDiffusion