Jeongin Lee

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

Apr 30, 2024

Chanran Kim, Jeongin Lee, Shichang Joung, Bongmo Kim, Yeul-Min Baek

Abstract: In the field of personalized image generation, the ability to create images that preserve concepts has improved significantly. However, creating an image that naturally integrates multiple concepts in a cohesive and visually appealing composition remains challenging. This paper introduces "InstantFamily," an approach that employs a novel masked cross-attention mechanism and a multimodal embedding stack to achieve zero-shot multi-ID image generation. Our method effectively preserves ID by utilizing global and local features from a pre-trained face recognition model, integrated with text conditions. Additionally, our masked cross-attention mechanism enables precise control of multiple IDs and of composition in the generated images. We demonstrate the effectiveness of InstantFamily through experiments showing its dominance in multi-ID image generation while resolving well-known multi-ID generation problems. Our model also achieves state-of-the-art performance in both single-ID and multi-ID preservation. Furthermore, it scales well, preserving a greater number of IDs than it was originally trained with.
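The abstract does not spell out how the masked cross-attention is computed, so the following is only a generic sketch of the idea and not the authors' implementation. It assumes that each image position (query) may attend only to the embedding tokens of the ID whose region it falls in. The function name `masked_cross_attention`, the arrays `pixel_region` and `token_owner`, and all shapes are hypothetical choices made for illustration.

```python
import numpy as np

def masked_cross_attention(queries, keys, values, mask):
    """Generic masked cross-attention (illustrative, not the paper's code).

    queries: (P, d) image-position features
    keys:    (T, d) conditioning-token features
    values:  (T, dv) conditioning-token values
    mask:    (P, T) boolean, True where position p may attend to token t
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Disallowed pairs get -inf so they receive zero weight after softmax.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax; fully masked rows fall back to all zeros.
    row_max = np.max(scores, axis=-1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)
    denom = weights.sum(axis=-1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ values, weights

# Toy setup (assumed): 5 image positions, 2 IDs with 2 embedding tokens each.
rng = np.random.default_rng(0)
queries = rng.normal(size=(5, 8))
keys = rng.normal(size=(4, 8))
values = rng.normal(size=(4, 8))
pixel_region = np.array([0, 0, 1, 1, -1])  # -1 marks background (no ID)
token_owner = np.array([0, 0, 1, 1])       # which ID each token belongs to
mask = pixel_region[:, None] == token_owner[None, :]

out, weights = masked_cross_attention(queries, keys, values, mask)
```

In this toy setup, positions in the first region receive no contribution from the second ID's tokens and vice versa, which is one plausible way a mask can keep identities from mixing. The background position attends to no ID token at all and gets a zero output here; a real model would presumably still condition it on text.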
