Title: Expressive Text-to-Image Generation with Rich Text

Apr 13, 2023

Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang

Abstract: Plain text has become a prevalent interface for text-to-image synthesis. However, its limited customization options hinder users from accurately describing desired outputs. For example, plain text makes it hard to specify continuous quantities, such as the precise RGB color value or importance of each word. Furthermore, creating detailed text prompts for complex scenes is tedious for humans to write and challenging for text encoders to interpret. To address these challenges, we propose using a rich-text editor supporting formats such as font style, size, color, and footnote. We extract each word's attributes from rich text to enable local style control, explicit token reweighting, precise color rendering, and detailed region synthesis. We achieve these capabilities through a region-based diffusion process. We first obtain each word's region based on cross-attention maps of a vanilla diffusion process using plain text. For each region, we enforce its text attributes by creating region-specific detailed prompts and applying region-specific guidance. We present various examples of image generation from rich text and demonstrate that our method outperforms strong baselines with quantitative evaluations.

* Project webpage: https://rich-text-to-image.github.io/
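
The region step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the cross-attention maps have already been averaged into one non-negative map per word, and it assigns each pixel to the word with the highest attention, which is a simplification chosen here for illustration. The function names `token_region_masks` and `blend_region_predictions` are hypothetical, as is the idea of blending per-region predictions by a plain masked sum.

```python
import numpy as np


def token_region_masks(attn_maps: np.ndarray) -> np.ndarray:
    """Turn per-word attention maps into hard region masks.

    attn_maps: array of shape (num_tokens, H, W), assumed to be
    cross-attention maps already averaged over layers and steps.
    Returns one-hot masks of the same shape: each pixel belongs to
    the word whose attention is highest there (an illustrative
    assumption, not necessarily the paper's exact rule).
    """
    if attn_maps.ndim != 3:
        raise ValueError("attn_maps must have shape (num_tokens, H, W)")
    owner = attn_maps.argmax(axis=0)  # (H, W) index of the winning word
    token_ids = np.arange(attn_maps.shape[0])[:, None, None]
    return (token_ids == owner[None]).astype(np.float32)


def blend_region_predictions(masks: np.ndarray, region_preds: np.ndarray) -> np.ndarray:
    """Combine per-region predictions into a single map.

    masks: (R, H, W) one-hot region masks.
    region_preds: (R, C, H, W), one prediction per region, e.g. the
    output of a denoiser run with that region's detailed prompt.
    Returns an array of shape (C, H, W).
    """
    if masks.shape[0] != region_preds.shape[0]:
        raise ValueError("masks and region_preds must agree on region count")
    return (masks[:, None] * region_preds).sum(axis=0)


# Toy usage: two words competing over a 2x2 image.
attn = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # attention of word 0
    [[0.1, 0.9], [0.8, 0.2]],  # attention of word 1
])
masks = token_region_masks(attn)
preds = np.stack([np.full((1, 2, 2), 1.0), np.full((1, 2, 2), 5.0)])
blended = blend_region_predictions(masks, preds)
```

Because the masks are one-hot, every pixel takes its value from exactly one region's prediction. A softer assignment, such as normalized attention weights, would blend regions at their boundaries instead.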
