Title: VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing

Jun 14, 2023

Paul Couairon, Clément Rambour, Jean-Emmanuel Haugeard, Nicolas Thome

Abstract: Recently, diffusion-based generative models have achieved remarkable success in image generation and editing. However, their use for video editing still faces important limitations. This paper introduces VidEdit, a novel method for zero-shot text-based video editing that ensures strong temporal and spatial consistency. First, we propose combining atlas-based and pre-trained text-to-image diffusion models to provide a training-free and efficient editing method that, by design, fulfills temporal smoothness. Second, we leverage off-the-shelf panoptic segmenters along with edge detectors and adapt their use for conditioned diffusion-based atlas editing. This ensures fine spatial control over targeted regions while strictly preserving the structure of the original video. Quantitative and qualitative experiments show that VidEdit outperforms state-of-the-art methods on the DAVIS dataset with respect to semantic faithfulness, image preservation, and temporal consistency metrics. With this framework, processing a single video takes only approximately one minute, and it can generate multiple compatible edits based on a single text prompt. Project web page: https://videdit.github.io
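To make the two ideas in the abstract concrete, here is a toy sketch, not the authors' implementation. It assumes the video is already summarized by a single 2D atlas image, that a binary mask marks the targeted region, and that each frame pixel stores an integer (row, col) lookup into the atlas. The function names `blend_with_mask` and `propagate_to_frames` are hypothetical, and real atlas mappings are continuous rather than integer lookups.

```python
def blend_with_mask(original, edited, mask):
    """Composite `edited` into `original` where mask == 1; keep original elsewhere.

    All three arguments are 2D lists of the same shape (hypothetical toy format).
    """
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("original, edited and mask must have the same height")
    blended = []
    for orig_row, edit_row, mask_row in zip(original, edited, mask):
        if not (len(orig_row) == len(edit_row) == len(mask_row)):
            raise ValueError("original, edited and mask must have the same width")
        blended.append(
            [m * e + (1 - m) * o for o, e, m in zip(orig_row, edit_row, mask_row)]
        )
    return blended


def propagate_to_frames(atlas, frame_uv_maps):
    """Rebuild every frame by looking up each pixel's (row, col) in the atlas.

    Because all frames read from the same edited atlas, a single edit shows up
    consistently in every frame that maps to the edited atlas location.
    """
    return [
        [[atlas[r][c] for (r, c) in uv_row] for uv_row in uv_map]
        for uv_map in frame_uv_maps
    ]


# Toy data: a 2x2 atlas, an "edit" that paints everything 99,
# and a mask that restricts the edit to the diagonal pixels.
original_atlas = [[10, 20], [30, 40]]
edited_atlas = [[99, 99], [99, 99]]
region_mask = [[1, 0], [0, 1]]

final_atlas = blend_with_mask(original_atlas, edited_atlas, region_mask)

# Two one-row frames that look at different parts of the atlas.
uv_maps = [
    [[(0, 0), (0, 1)]],
    [[(0, 1), (1, 1)]],
]
frames = propagate_to_frames(final_atlas, uv_maps)
```

In this sketch the mask plays the role of the segmentation-derived region and the single shared atlas is what gives temporal smoothness by construction; the diffusion-based edit itself is stood in for by the constant `edited_atlas`.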
