Title: PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners

Oct 07, 2024

Yujin Tang, Lu Qi, Fei Xie, Xiangtai Li, Chao Ma, Ming-Hsuan Yang

Abstract: Spatiotemporal predictive learning methods generally fall into two categories: recurrent-based approaches, which face challenges in parallelization and performance, and recurrent-free methods, which employ convolutional neural networks (CNNs) as encoder-decoder architectures. These methods benefit from strong inductive biases, but often at the expense of scalability and generalization. This paper proposes PredFormer, a pure transformer-based framework for spatiotemporal predictive learning. Motivated by the Vision Transformer (ViT) design, PredFormer leverages carefully designed Gated Transformer blocks, following a comprehensive analysis of 3D attention mechanisms, including full, factorized, and interleaved spatial-temporal attention. With its recurrent-free, transformer-based design, PredFormer is both simple and efficient, outperforming previous methods by large margins. Extensive experiments on synthetic and real-world datasets demonstrate that PredFormer achieves state-of-the-art performance. On Moving MNIST, PredFormer achieves a 51.3% reduction in MSE relative to SimVP. On TaxiBJ, the model decreases MSE by 33.1% and boosts FPS from 533 to 2364. On WeatherBench, it reduces MSE by 11.1% while raising FPS from 196 to 404. These gains in both accuracy and efficiency demonstrate PredFormer's potential for real-world applications. The source code will be released at https://github.com/yyyujintang/PredFormer.
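
To make the distinction between the attention layouts named in the abstract more concrete, the sketch below counts how many query-key pairs a single attention pass scores when attention is full (all tokens in the clip), spatial-only (tokens within one frame), or temporal-only (one patch position across frames). This is an illustration of the general idea, not the authors' implementation: the function name `attention_pairs` and the sizes (10 frames, 64 patches per frame) are assumptions chosen for the example, and the abstract does not specify how PredFormer orders or combines its blocks.

```python
def attention_pairs(num_frames: int, tokens_per_frame: int) -> dict:
    """Count query-key pairs scored by one attention pass under each layout.

    Hypothetical helper for illustration; it models cost only, not the
    PredFormer architecture itself.
    """
    t, n = num_frames, tokens_per_frame
    full = (t * n) ** 2      # every token attends to every token in the clip
    spatial = t * n ** 2     # each token attends only within its own frame
    temporal = n * t ** 2    # each token attends to its patch position across frames
    return {
        "full": full,
        "spatial": spatial,
        "temporal": temporal,
        # one spatial pass followed by one temporal pass
        "spatial_plus_temporal": spatial + temporal,
    }


# Assumed sizes for the example: 10 frames, 64 patches per frame.
counts = attention_pairs(10, 64)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumed sizes, a spatial pass plus a temporal pass scores far fewer pairs than one full pass, which is the usual motivation for restricting attention to one axis at a time.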

* 15 pages, 7 figures
