Title: VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

Oct 06, 2024

Dohun Lee, Bryan S Kim, Geon Yeong Park, Jong Chul Ye

Abstract: Text-to-image (T2I) diffusion models have revolutionized visual content creation, but extending these capabilities to text-to-video (T2V) generation remains a challenge, particularly in preserving temporal consistency. Existing methods that aim to improve consistency often introduce trade-offs such as reduced imaging quality and impractical computational time. To address these issues, we introduce VideoGuide, a novel framework that enhances the temporal consistency of pretrained T2V models without the need for additional training or fine-tuning. Instead, VideoGuide leverages any pretrained video diffusion model (VDM) or itself as a guide during the early stages of inference, improving temporal quality by interpolating the guiding model's denoised samples into the sampling model's denoising process. The proposed method brings about significant improvement in temporal consistency and image fidelity, providing a cost-effective and practical solution that synergizes the strengths of various video diffusion models. Furthermore, we demonstrate prior distillation, revealing that base models can achieve enhanced text coherence by utilizing the superior data prior of the guiding model through the proposed method. Project Page: http://videoguide2025.github.io/

* 24 pages, 14 figures, Project Page: http://videoguide2025.github.io/
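
The core idea in the abstract (blending a guiding model's denoised estimate into the sampling model's denoising process during early inference steps) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the blend weight `beta`, the `guide_steps` cutoff, the simplified noise model `x_t = x0 + t * eps`, and the stand-in denoisers, which in practice would be pretrained video diffusion models operating on video latents rather than lists of floats.

```python
def interpolate_denoised(x0_sampler, x0_guide, beta):
    """Blend two denoised estimates: beta * sampler + (1 - beta) * guide."""
    return [beta * s + (1.0 - beta) * g for s, g in zip(x0_sampler, x0_guide)]


def guided_sampling(x_init, sampler_denoise, guide_denoise,
                    num_steps, guide_steps, beta):
    """Toy deterministic sampler with guidance in the first `guide_steps` steps.

    Assumes the simplified noise model x_t = x0 + t * eps, with t going
    from 1 down to 0. Both denoisers map (x_t, t) to an estimate of x0.
    """
    x = list(x_init)
    for step in range(num_steps):
        t = 1.0 - step / num_steps            # current noise level, always > 0
        t_next = 1.0 - (step + 1) / num_steps  # next noise level, 0 at the end

        # Denoised estimate from the sampling (base) model.
        x0 = sampler_denoise(x, t)

        # Early steps only: mix in the guiding model's denoised estimate.
        if step < guide_steps:
            x0 = interpolate_denoised(x0, guide_denoise(x, t), beta)

        # Re-noise to level t_next along the same noise direction:
        # eps = (x_t - x0) / t, so x_{t_next} = x0 + t_next * eps.
        x = [x0_i + (t_next / t) * (x_i - x0_i) for x_i, x0_i in zip(x, x0)]
    return x


if __name__ == "__main__":
    # Hypothetical stand-in denoisers that ignore their input.
    base = lambda x, t: [0.0] * len(x)
    guide = lambda x, t: [1.0] * len(x)
    print(guided_sampling([2.0, -2.0], base, guide,
                          num_steps=4, guide_steps=2, beta=0.5))
```

In this sketch the guide only influences the trajectory while `step < guide_steps`; later steps rely on the sampling model alone, mirroring the abstract's statement that guidance is applied during the early stages of inference. The actual interpolation weights and schedule used by VideoGuide are not stated in the abstract.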
