Title: User-Friendly Customized Generation with Multi-Modal Prompts

May 26, 2024

Linhao Zhong, Yan Hong, Wentao Chen, Binglin Zhou, Yiyi Zhang, Jianfu Zhang, Liqing Zhang

Abstract: Text-to-image generation models have advanced considerably, meeting the growing interest in personalized image creation. Current customization techniques often require users to provide multiple images (typically 3-5) for each customized object, along with the class of each object and descriptive text prompts for scenes. This paper asks whether the process can be made more user-friendly and the customization more intricate. We propose a method in which users need only provide images along with text for each customization topic, and which requires only a single image per visual concept. We introduce the concept of a "multi-modal prompt", a novel integration of text and images tailored to each customization concept, which simplifies user interaction and facilitates precise customization of both objects and scenes. Our proposed paradigm for customized text-to-image generation surpasses existing finetune-based methods in user-friendliness and in the ability to customize complex objects with user-friendly inputs. Our code is available at https://github.com/zhongzero/Multi-Modal-Prompt.

* 11 pages, 8 figures
