Title: Controllable Textual Inversion for Personalized Text-to-Image Generation

Apr 12, 2023

Jianan Yang, Haobo Wang, Ruixuan Xiao, Sai Wu, Gang Chen, Junbo Zhao

Abstract: Recent large-scale generative modeling has attained unprecedented performance, especially in producing high-fidelity images driven by text prompts. Textual inversion (TI), used alongside text-to-image model backbones, has been proposed as an effective technique for personalizing generation when the prompts contain user-defined, unseen, or long-tail concept tokens. Despite that, we find and show that the deployment of TI remains full of "dark magic": to name a few issues, the harsh requirement of additional datasets, arduous human effort in the loop, and a lack of robustness. In this work, we propose a much-enhanced version of TI, dubbed Controllable Textual Inversion (COTI), which resolves all of the aforementioned problems and in turn delivers a robust, data-efficient, and easy-to-use framework. The core of COTI is a theoretically guided loss objective instantiated with a comprehensive and novel weighted scoring mechanism, encapsulated in an active-learning paradigm. Extensive results show that COTI significantly outperforms prior TI-related approaches, with a 26.05 decrease in the FID score and a 23.00% boost in R-precision.

* 10 pages, 6 figures, 2 tables. Project Page: https://github.com/jnzju/COTI
