
Title: Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

Mar 08, 2024

Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li, Cheng Zhang, Yang Song



Abstract: Vanilla text-to-image diffusion models struggle with generating accurate human images, commonly resulting in imperfect anatomies such as unnatural postures or disproportionate limbs. Existing methods address this issue mostly by fine-tuning the model with extra images or adding additional controls (human-centric priors such as pose or depth maps) during the image generation phase. This paper explores the integration of these human-centric priors directly into the model fine-tuning stage, essentially eliminating the need for extra conditions at the inference stage. We realize this idea by proposing a human-centric alignment loss to strengthen human-related information from the textual prompts within the cross-attention maps. To ensure semantic detail richness and human structural accuracy during fine-tuning, we introduce scale-aware and step-wise constraints within the diffusion process, according to an in-depth analysis of the cross-attention layer. Extensive experiments show that our method largely improves over state-of-the-art text-to-image models to synthesize high-quality human images based on user-written prompts. Project page: https://hcplayercvpr2024.github.io
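To make the idea of an alignment loss on cross-attention maps more concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the function names (`normalize`, `alignment_loss`, `total_loss`), the mean-squared-error form, and the single `weight` parameter are all assumptions chosen for illustration, and the scale-aware and step-wise constraints mentioned in the abstract are not modeled. The sketch assumes one cross-attention map for a human-related prompt token and one prior heatmap (for example, derived from a pose map) at the same resolution.

```python
from typing import List

Map2D = List[List[float]]


def normalize(m: Map2D) -> Map2D:
    """Scale a non-negative 2D map so that its entries sum to 1."""
    total = sum(sum(row) for row in m)
    if total == 0:
        raise ValueError("map has no mass to normalize")
    return [[v / total for v in row] for row in m]


def alignment_loss(attn: Map2D, prior: Map2D) -> float:
    """Mean squared error between the normalized attention and prior maps.

    Hypothetical stand-in for a human-centric alignment term; the actual
    loss used in the paper may differ.
    """
    if len(attn) != len(prior) or any(
        len(a) != len(p) for a, p in zip(attn, prior)
    ):
        raise ValueError("attention map and prior must have the same shape")
    a, p = normalize(attn), normalize(prior)
    n = sum(len(row) for row in a)
    squared = sum(
        (x - y) ** 2 for ra, rp in zip(a, p) for x, y in zip(ra, rp)
    )
    return squared / n


def total_loss(
    denoise_loss: float, attn: Map2D, prior: Map2D, weight: float = 0.1
) -> float:
    """Add the weighted alignment term to a standard denoising loss.

    `weight` is an assumed hyperparameter, not a value from the paper.
    """
    return denoise_loss + weight * alignment_loss(attn, prior)
```

Because the prior enters only through this training-time term, a model fine-tuned this way would need no pose or depth input at inference, which matches the motivation stated in the abstract.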

* Accepted to CVPR 2024
