Title: Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Nov 28, 2024

Weimin Qiu, Jieke Wang, Meng Tang

Abstract: Diffusion models have achieved unprecedented fidelity and diversity in synthesizing images, videos, and 3D assets. However, subject mixing is a known and unresolved issue in diffusion-based image synthesis, particularly when synthesizing multiple similar-looking subjects. We propose Self-Cross diffusion guidance, which penalizes the overlap between cross-attention maps and aggregated self-attention maps. Compared to previous methods based on self-attention or cross-attention alone, our Self-Cross guidance is more effective at eliminating subject mixing. Moreover, our guidance addresses mixing for all relevant patches of a subject, beyond the most discriminant one (e.g., the beak of a bird). We aggregate the self-attention maps of automatically selected patches of a subject to form a region that the whole subject attends to. Our method is training-free and can boost the performance of any transformer-based diffusion model such as Stable Diffusion. We also release a more challenging benchmark with many text prompts of similar-looking subjects and use GPT-4o for automatic and reliable evaluation. Extensive qualitative and quantitative results demonstrate the effectiveness of our Self-Cross guidance.
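The overlap penalty described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the formula, so the top-k patch selection, the cross-attention weighting, the elementwise-minimum overlap measure, and the names `aggregate_self_attention`, `self_cross_overlap`, and `top_k` are all assumptions made for illustration.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero on all-zero maps


def aggregate_self_attention(cross_map, self_attn, top_k=4):
    """Aggregate self-attention maps of the patches a subject attends to most.

    cross_map: (N,) cross-attention of one subject token over N patches.
    self_attn: (N, N) matrix; row i is the self-attention map of patch i.
    Assumption: patches are chosen as the top_k cross-attention responses
    and weighted by those responses (hypothetical selection rule).
    """
    idx = np.argsort(cross_map)[-top_k:]
    weights = cross_map[idx] / (cross_map[idx].sum() + EPS)
    return weights @ self_attn[idx]  # shape (N,)


def self_cross_overlap(cross_a, cross_b, self_attn, top_k=4):
    """Overlap between each subject's aggregated self-attention region
    and the other subject's cross-attention map.

    Assumption: overlap is the sum of elementwise minima of the two
    normalized maps; a guidance step would push this score down.
    """
    def normalize(m):
        return m / (m.sum() + EPS)

    agg_a = aggregate_self_attention(cross_a, self_attn, top_k)
    agg_b = aggregate_self_attention(cross_b, self_attn, top_k)
    a_on_b = np.minimum(normalize(agg_a), normalize(cross_b)).sum()
    b_on_a = np.minimum(normalize(agg_b), normalize(cross_a)).sum()
    return float(a_on_b + b_on_a)
```

In an actual training-free guidance loop, a score of this kind would be computed with a differentiable framework and its gradient with respect to the latent used to update the sample at each denoising step; the NumPy version above only computes the score.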
