Title: One-Step Image Translation with Text-to-Image Models

Mar 18, 2024

Gaurav Parmar, Taesung Park, Srinivasa Narasimhan, Jun-Yan Zhu

Abstract: In this work, we address two limitations of existing conditional diffusion models: their slow inference speed due to the iterative denoising process and their reliance on paired data for model fine-tuning. To tackle these issues, we introduce a general method for adapting a single-step diffusion model to new tasks and domains through adversarial learning objectives. Specifically, we consolidate various modules of the vanilla latent diffusion model into a single end-to-end generator network with small trainable weights, enhancing its ability to preserve the input image structure while reducing overfitting. We demonstrate that, for unpaired settings, our model CycleGAN-Turbo outperforms existing GAN-based and diffusion-based methods for various scene translation tasks, such as day-to-night conversion and adding/removing weather effects like fog, snow, and rain. We extend our method to paired settings, where our model pix2pix-Turbo is on par with recent works like Control-Net for Sketch2Photo and Edge2Image, but with a single-step inference. This work suggests that single-step diffusion models can serve as strong backbones for a range of GAN learning objectives. Our code and models are available at https://github.com/GaParmar/img2img-turbo.

* GitHub: https://github.com/GaParmar/img2img-turbo
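
The unpaired setting in the abstract can be illustrated with a toy cycle-consistency term of the kind used in CycleGAN-style training. This is a minimal sketch, not the authors' implementation: the abstract does not spell out the losses, and every name here (`l1`, `cycle_consistency_loss`, `to_night`, `to_day`) is hypothetical. The "generators" are plain scaling functions standing in for the single-step networks, and images are flat lists of pixel values.

```python
from typing import Callable, List

# Toy stand-in for an image: a flat list of pixel intensities (assumption).
Image = List[float]
Generator = Callable[[Image], Image]


def l1(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x: Image, y: Image,
                           g_xy: Generator, g_yx: Generator) -> float:
    """Penalize failure to recover an image after a round trip.

    x -> g_xy -> g_yx should give back x, and
    y -> g_yx -> g_xy should give back y.
    No paired (x, y) examples are needed: x and y are unrelated samples.
    """
    return l1(g_yx(g_xy(x)), x) + l1(g_xy(g_yx(y)), y)


# Hypothetical toy translators: "night" halves brightness, "day" doubles it.
def to_night(img: Image) -> Image:
    return [0.5 * p for p in img]


def to_day(img: Image) -> Image:
    return [2.0 * p for p in img]


day_sample = [0.2, 0.8, 0.6]     # an unpaired daytime image
night_sample = [0.1, 0.05, 0.3]  # an unrelated night image

# Exact inverses give a zero cycle loss.
consistent = cycle_consistency_loss(day_sample, night_sample, to_night, to_day)

# A mismatched pair (both directions darken) gives a positive loss.
inconsistent = cycle_consistency_loss(day_sample, night_sample, to_night, to_night)
```

In actual training a term like this would be combined with adversarial losses from domain discriminators; the sketch isolates only the round-trip constraint that makes unpaired data usable.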