Title: PEEKABOO: Interactive Video Generation via Masked-Diffusion

Dec 12, 2023

Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl

Abstract: Recently, there has been a lot of progress in text-to-video generation, with state-of-the-art models capable of generating high-quality, realistic videos. However, these models do not let users interactively control the videos they generate, a capability that could unlock new areas of application. As a first step towards this goal, we tackle the problem of endowing diffusion-based video generation models with interactive spatio-temporal control over their output. To this end, we take inspiration from recent advances in the segmentation literature to propose a novel spatio-temporal masked attention module, Peekaboo. This module is a training-free, no-inference-overhead addition to off-the-shelf video generation models which enables spatio-temporal control. We also propose an evaluation benchmark for the interactive video generation task. Through extensive qualitative and quantitative evaluation, we establish that Peekaboo enables controlled video generation and even obtains a gain of up to 3.8x in mIoU over baseline models.

* Project webpage - https://jinga-lala.github.io/projects/Peekaboo/
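
The abstract reports its gain in mIoU (mean intersection over union). As a rough illustration of what such a score measures, the sketch below averages the per-frame IoU between a user-requested bounding box and the box where the object was actually found in the generated frame. This is an assumption about the general shape of the metric, not the paper's evaluation protocol, which is not described here; the names `Box`, `box_iou`, and `mean_iou` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(requested: List[Box], detected: List[Box]) -> float:
    """Average per-frame IoU between requested and detected boxes.

    Assumes one box per frame in each list; this pairing is an
    illustrative assumption, not the benchmark's definition.
    """
    if len(requested) != len(detected):
        raise ValueError("need one requested and one detected box per frame")
    if not requested:
        return 0.0
    total = sum(box_iou(r, d) for r, d in zip(requested, detected))
    return total / len(requested)
```

Under this reading, a score of 1.0 means the object sits exactly inside the requested region in every frame, and a score near 0 means the model ignored the requested location.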
