Pixel-level fine-grained image editing remains an open challenge. Previous works fail to strike an ideal trade-off between control granularity and inference speed: they either lack pixel-level fine-grained control or suffer from slow inference. To address this, this paper employs, for the first time, a regression-based network to learn the variation patterns of StyleGAN latent codes during the image dragging process. This enables pixel-level precision in drag editing at low time cost. Users can specify handle points and their corresponding target points on any GAN-generated image, and our method moves each handle point to its corresponding target point. Through experimental analysis, we find that a short movement distance from handle points to target points yields high-fidelity edits, since the model only needs to predict the movement of a small portion of pixels. Building on this observation, we decompose the entire movement process into multiple sub-processes. Specifically, we develop a transformer encoder-decoder network named 'Latent Predictor' that predicts the motion trajectories of latent codes from handle points to target points in an autoregressive manner. Moreover, to enhance prediction stability, we introduce a component named 'Latent Regularizer' that constrains the latent code motion to remain within the distribution of natural images. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) inference speed and image editing performance at pixel-level granularity.
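
To make the described pipeline concrete, the following is a minimal PyTorch sketch of the autoregressive drag-editing loop: a long drag is split into short sub-moves, and at each sub-move a transformer encoder-decoder (a stand-in for the 'Latent Predictor') regresses an offset to the StyleGAN W+ latent code. All module names, dimensions, the point encoding, and the number of sub-steps are illustrative assumptions rather than the authors' implementation, and the 'Latent Regularizer' is omitted.

```python
# Illustrative sketch (not the authors' code) of autoregressive latent-offset
# prediction for drag editing. Dimensions and names are assumptions.
import torch
import torch.nn as nn


class LatentPredictorSketch(nn.Module):
    """Hypothetical stand-in: maps (current W+ latent, handle/target points) to a latent offset."""

    def __init__(self, latent_dim: int = 512, num_latents: int = 18, d_model: int = 256):
        super().__init__()
        self.latent_proj = nn.Linear(latent_dim, d_model)
        self.point_proj = nn.Linear(4, d_model)          # (x_h, y_h, x_t, y_t) per point pair
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.out_proj = nn.Linear(d_model, latent_dim)   # predicted offset for each W+ vector

    def forward(self, w_plus: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # w_plus: (B, num_latents, latent_dim); points: (B, num_pairs, 4)
        memory = self.point_proj(points)                 # conditioning tokens (encoder input)
        queries = self.latent_proj(w_plus)               # latent tokens (decoder input)
        hidden = self.transformer(src=memory, tgt=queries)
        return self.out_proj(hidden)                     # (B, num_latents, latent_dim)


def drag_edit(w_plus, handle, target, predictor, num_steps: int = 8):
    """Decompose one long drag into `num_steps` short sub-moves, applied autoregressively."""
    step = (target - handle) / num_steps                 # short per-step displacement
    cur_handle = handle.clone()
    for _ in range(num_steps):
        cur_target = cur_handle + step
        points = torch.cat([cur_handle, cur_target], dim=-1)
        w_plus = w_plus + predictor(w_plus, points)      # regressed latent update for this sub-move
        cur_handle = cur_target                          # next sub-move starts where this one ends
    return w_plus


if __name__ == "__main__":
    predictor = LatentPredictorSketch()
    w = torch.randn(1, 18, 512)                          # StyleGAN W+ code (e.g., 18 x 512)
    handle = torch.tensor([[[64.0, 80.0]]])              # one handle point, shape (B, P, 2)
    target = torch.tensor([[[96.0, 80.0]]])              # its target point
    edited_w = drag_edit(w, handle, target, predictor)
    print(edited_w.shape)                                # torch.Size([1, 18, 512])
```

The short per-step displacement mirrors the observation above: each sub-move asks the network to account for only a small portion of pixels, which is what makes the regression tractable in this sketch.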