Apolinário Passos

Hugging Face

LEDITS++: Limitless Image Editing using Text-to-Image Models

Nov 28, 2023

Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos

Abstract: Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from text inputs alone. Subsequent research efforts aim to exploit and apply their capabilities to real image editing. However, existing image-to-image methods are often inefficient, imprecise, and of limited versatility. They require time-consuming fine-tuning, deviate unnecessarily strongly from the input image, and/or lack support for multiple, simultaneous edits. To address these issues, we introduce LEDITS++, an efficient yet versatile and precise textual image manipulation technique. First, LEDITS++'s novel inversion approach requires neither tuning nor optimization and produces high-fidelity results with a few diffusion steps. Second, our methodology supports multiple simultaneous edits and is architecture-agnostic. Third, we use a novel implicit masking technique that limits changes to relevant image regions. We propose the novel TEdBench++ benchmark as part of our exhaustive evaluation. Our results demonstrate the capabilities of LEDITS++ and its improvements over previous methods. The project page is available at https://leditsplusplus-project.static.hf.space.

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Nov 09, 2023

Simian Luo, Yiqin Tan, Suraj Patil, Daniel Gu, Patrick von Platen, Apolinário Passos, Longbo Huang, Jian Li, Hang Zhao

Abstract: Latent Consistency Models (LCMs) have achieved impressive performance in accelerating text-to-image generative tasks, producing high-quality images with minimal inference steps. LCMs are distilled from pre-trained latent diffusion models (LDMs), requiring only ~32 A100 GPU hours of training. This report further extends LCMs' potential in two aspects. First, by applying LoRA distillation to Stable-Diffusion models including SD-V1.5, SSD-1B, and SDXL, we have expanded LCMs' scope to larger models with significantly less memory consumption, achieving superior image generation quality. Second, we identify the LoRA parameters obtained through LCM distillation as a universal Stable-Diffusion acceleration module, named LCM-LoRA. LCM-LoRA can be directly plugged into various Stable-Diffusion fine-tuned models or LoRAs without training, thus representing a universally applicable accelerator for diverse image generation tasks. Compared with previous numerical PF-ODE solvers such as DDIM and DPM-Solver, LCM-LoRA can be viewed as a plug-in neural PF-ODE solver that possesses strong generalization abilities. Project page: https://github.com/luosiallen/latent-consistency-model.

* Technical Report
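
The "plug-in" idea in the abstract above can be pictured with a toy example. A LoRA stores a low-rank residual that is added to a layer's existing weights, so the same residual can in principle be added to a different fine-tuned checkpoint, or summed with another LoRA. The sketch below is only an illustration of that arithmetic under stated assumptions: the 2x2 weights, the rank-1 factors, the scale values, and the helper names `matmul`, `lora_delta`, and `apply_deltas` are all hypothetical and are not taken from the report or from any library.

```python
# Illustrative sketch only: tiny matrices stand in for real model weights.
# All names and numbers here are hypothetical.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(down, up, scale=1.0):
    """Return the low-rank weight residual scale * (up @ down)."""
    return [[scale * v for v in row] for row in matmul(up, down)]


def apply_deltas(base, deltas):
    """Add one or more weight residuals to a copy of the base weight matrix."""
    merged = [row[:] for row in base]
    for delta in deltas:
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                merged[i][j] += v
    return merged


# Hypothetical layer weights of some fine-tuned model.
finetuned_weight = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical rank-1 factors for an acceleration LoRA and a style LoRA.
accel_down, accel_up = [[1.0, 2.0]], [[0.5], [0.25]]
style_down, style_up = [[0.0, 1.0]], [[1.0], [-1.0]]

accel = lora_delta(accel_down, accel_up, scale=1.0)
style = lora_delta(style_down, style_up, scale=0.5)

# No training step is involved: the residuals are simply added to the weights.
merged_weight = apply_deltas(finetuned_weight, [accel, style])
print(merged_weight)
```

Because the merge is a plain sum, the base weights are left untouched and each residual can be rescaled or dropped independently; whether a given combination produces good images is an empirical question that this sketch does not address.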

LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance

Jul 02, 2023

Linoy Tsaban, Apolinário Passos

Abstract: Recent large-scale text-guided diffusion models provide powerful image-generation capabilities. Currently, significant effort is devoted to enabling the modification of these images using text only, as a means of offering intuitive and versatile editing. However, editing proves difficult for these generative models because editing techniques inherently involve preserving certain content from the original image. Conversely, in text-based models, even minor modifications to the text prompt frequently produce an entirely different result, making one-shot generation that accurately matches the user's intent exceedingly challenging. In addition, to edit a real image using these state-of-the-art tools, one must first invert the image into the pre-trained model's domain, which adds another factor affecting edit quality as well as latency. In this exploratory report, we propose LEDITS, a combined lightweight approach for real-image editing that incorporates the Edit Friendly DDPM inversion technique with Semantic Guidance, thus extending Semantic Guidance to real image editing while also harnessing the editing capabilities of DDPM inversion. This approach achieves versatile edits, both subtle and extensive, as well as alterations in composition and style, while requiring neither optimization nor extensions to the architecture.

* 8 pages, 5 figures, 1 table. This report builds on the works introduced in arXiv:2304.06140 and arXiv:2301.12247
