Title: Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis

Nov 28, 2023

Xiaohui Chen, Yongfei Liu, Yingxiang Yang, Jianbo Yuan, Quanzeng You, Li-Ping Liu, Hongxia Yang

Abstract: Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts. Despite these advances, diffusion models sometimes fail to fully translate the semantic content of the text into images. While conditioning on a layout has been shown to improve the compositional ability of T2I diffusion models, such methods typically require manual layout input. In this work, we introduce a novel approach to improving T2I diffusion models using Large Language Models (LLMs) as layout generators. Our method leverages the Chain-of-Thought prompting of LLMs to interpret text and generate spatially reasonable object layouts. The generated layout is then used to enhance the composition and spatial accuracy of the generated images. Moreover, we propose an efficient adapter based on a cross-attention mechanism, which explicitly integrates the layout information into Stable Diffusion models. Our experiments demonstrate significant improvements in image quality and layout accuracy, showcasing the potential of LLMs in augmenting generative image models.
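To make the idea of an LLM-generated layout concrete, here is a minimal sketch of how a free-text LLM response might be turned into bounding boxes before being passed to a layout-conditioned model. The abstract does not specify the prompt or the output format, so the line format `<label>: [x0, y0, x1, y1]` with normalized coordinates, the `Box` type, and the `parse_layout` helper are all assumptions made for illustration and are not taken from the paper.

```python
import re
from typing import List, NamedTuple


class Box(NamedTuple):
    """One object in a layout; coordinates are normalized to [0, 1]."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float


# Hypothetical output format (an assumption, not the paper's):
# "<label>: [x0, y0, x1, y1]", one object per line.
_BOX_LINE = re.compile(r"^\s*([\w ]+?)\s*:\s*\[\s*([^\]]+)\]\s*$")


def parse_layout(response: str) -> List[Box]:
    """Extract boxes from an LLM response, skipping free-form reasoning lines."""
    boxes: List[Box] = []
    for line in response.splitlines():
        match = _BOX_LINE.match(line)
        if match is None:
            continue  # chain-of-thought text or anything else that is not a box
        try:
            coords = [float(v) for v in match.group(2).split(",")]
        except ValueError:
            continue  # non-numeric content inside the brackets
        if len(coords) != 4:
            continue
        # Clamp each coordinate into the unit square.
        x0, y0, x1, y1 = (min(max(c, 0.0), 1.0) for c in coords)
        if x1 <= x0 or y1 <= y0:
            continue  # drop degenerate (zero-area or inverted) boxes
        boxes.append(Box(match.group(1), x0, y0, x1, y1))
    return boxes


example_response = """The cat should sit to the left of the dog.
cat: [0.10, 0.50, 0.40, 0.90]
dog: [0.55, 0.45, 1.20, 0.95]
bird: [0.5, 0.5, 0.5, 0.6]
"""
layout = parse_layout(example_response)
```

In this example the reasoning sentence is ignored, the out-of-range `dog` coordinate is clamped to the image border, and the zero-width `bird` box is discarded. Validating the layout this way is one possible design choice; a real pipeline might instead re-prompt the LLM when a box is malformed.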

* preprint
