Title: Advancing Auto-Regressive Continuation for Video Frames

Dec 04, 2024

Ruibo Ming, Jingwei Wu, Zhewei Huang, Zhuoxuan Ju, Jianming HU, Lihui Peng, Shuchang Zhou

Abstract: Recent advances in auto-regressive large language models (LLMs) have shown their potential for generating high-quality text, inspiring researchers to apply them to image and video generation. This paper explores the application of LLMs to video continuation, a task essential for building world models and predicting future frames. We tackle challenges including preventing degeneration in long-term frame generation and enhancing the quality of generated images. We design a scheme named ARCON, which trains the model to alternately generate semantic tokens and RGB tokens, enabling the LLM to explicitly learn and predict the high-level structural information of the video. We find high consistency between the generated RGB images and semantic maps, without any special design. Moreover, we employ an optical flow-based texture stitching method to enhance the visual quality of the generated videos. Quantitative and qualitative experiments in autonomous driving scenarios demonstrate that our model can consistently generate long videos.
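
The alternating token layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each frame has already been tokenized into a fixed number of semantic tokens and RGB tokens, and that semantic tokens precede RGB tokens within a frame; the abstract does not state the order. The function names `interleave_frames` and `split_frames` and the token counts are hypothetical.

```python
from typing import List, Sequence, Tuple


def interleave_frames(
    semantic: Sequence[Sequence[int]],
    rgb: Sequence[Sequence[int]],
) -> List[int]:
    """Flatten per-frame token lists into one sequence:
    [sem_0, rgb_0, sem_1, rgb_1, ...].

    Assumption: semantic tokens come first within each frame.
    """
    if len(semantic) != len(rgb):
        raise ValueError("need one semantic and one RGB token list per frame")
    sequence: List[int] = []
    for sem_tokens, rgb_tokens in zip(semantic, rgb):
        sequence.extend(sem_tokens)
        sequence.extend(rgb_tokens)
    return sequence


def split_frames(
    sequence: Sequence[int],
    n_sem: int,
    n_rgb: int,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Invert interleave_frames for fixed per-frame token counts
    (n_sem and n_rgb are hypothetical parameters)."""
    step = n_sem + n_rgb
    if step <= 0 or len(sequence) % step != 0:
        raise ValueError("sequence length must be a multiple of n_sem + n_rgb")
    semantic: List[List[int]] = []
    rgb: List[List[int]] = []
    for start in range(0, len(sequence), step):
        semantic.append(list(sequence[start:start + n_sem]))
        rgb.append(list(sequence[start + n_sem:start + step]))
    return semantic, rgb
```

Under these assumptions, an auto-regressive model trained on such a sequence would have to predict each frame's semantic tokens before its RGB tokens, which is one way the high-level structure could be made explicit in the prediction target.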

* Under Review
