Title: Locate, Assign, Refine: Taming Customized Image Inpainting with Text-Subject Guidance

Mar 28, 2024

Yulin Pan, Chaojie Mao, Zeyinzi Jiang, Zhen Han, Jingfeng Zhang

Abstract: Prior studies have made significant progress in image inpainting guided by either text or a subject image. However, research on editing with their combined guidance is still in its early stages. To tackle this challenge, we present LAR-Gen, a novel approach that enables seamless inpainting of masked scene images while incorporating both textual prompts and specified subjects. Our approach works in a coarse-to-fine manner to ensure subject identity preservation and local semantic coherence. The process involves (i) Locate: concatenating the noise with the masked scene image to achieve precise regional editing, (ii) Assign: employing a decoupled cross-attention mechanism to accommodate multi-modal guidance, and (iii) Refine: using a novel RefineNet to supplement subject details. Additionally, to address the scarcity of training data, we introduce a novel data construction pipeline. This pipeline extracts a substantial number of data pairs, each consisting of a local text prompt and the corresponding visual instance, from a vast image dataset, leveraging publicly available large models. Extensive experiments and varied application scenarios demonstrate the superiority of LAR-Gen in terms of both identity preservation and text semantic consistency. The project page can be found at https://ali-vilab.github.io/largen-page/.
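The Locate and Assign steps can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the code assumes one common form of decoupled cross-attention (one attention branch per modality, with the outputs summed) and omits learned projection weights entirely. All names here (`locate_input`, `decoupled_cross_attention`, `subject_scale`) are hypothetical, and RefineNet is not modeled.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out

def locate_input(noise, masked_scene, mask):
    """Locate (assumed form): concatenate along the channel axis.

    Each argument is a list of channels; including the mask itself as an
    extra channel is an assumption, not something the abstract states.
    """
    return noise + masked_scene + mask

def decoupled_cross_attention(queries, text_tokens, subject_tokens,
                              subject_scale=1.0):
    """Assign (assumed form): attend to text and subject tokens separately.

    The tokens serve as both keys and values because this sketch has no
    learned projections. The two branch outputs are summed, with
    subject_scale (a hypothetical knob) weighting the subject branch.
    """
    text_out = attention(queries, text_tokens, text_tokens)
    subject_out = attention(queries, subject_tokens, subject_tokens)
    return [[t + subject_scale * s for t, s in zip(t_row, s_row)]
            for t_row, s_row in zip(text_out, subject_out)]

if __name__ == "__main__":
    queries = [[1.0, 0.0]]
    text_tokens = [[2.0, 3.0]]
    subject_tokens = [[10.0, 20.0]]
    print(decoupled_cross_attention(queries, text_tokens, subject_tokens, 0.5))
```

Keeping the two branches separate means the subject guidance can be weakened or switched off (`subject_scale=0.0`) without changing how the text prompt is attended to, which is the usual motivation for decoupling.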

* 22 pages, 14 figures
