Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:DINO-Foresight Looking into the Future with DINO

Dec 16, 2024

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

Figure 1 for DINO-Foresight Looking into the Future with DINO

Figure 2 for DINO-Foresight Looking into the Future with DINO

Figure 3 for DINO-Foresight Looking into the Future with DINO

Figure 4 for DINO-Foresight Looking into the Future with DINO

Share this with someone who'll enjoy it:

Abstract:Predicting future dynamics is crucial for applications like autonomous driving and robotics, where understanding the environment is key. Existing pixel-level methods are computationally expensive and often focus on irrelevant details. To address these challenges, we introduce $\texttt{DINO-Foresight}$, a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs). Our approach trains a masked feature transformer in a self-supervised manner to predict the evolution of VFM features over time. By forecasting these features, we can apply off-the-shelf, task-specific heads for various scene understanding tasks. In this framework, VFM features are treated as a latent space, to which different heads attach to perform specific tasks for future-frame analysis. Extensive experiments show that our framework outperforms existing methods, demonstrating its robustness and scalability. Additionally, we highlight how intermediate transformer representations in $\texttt{DINO-Foresight}$ improve downstream task performance, offering a promising path for the self-supervised enhancement of VFM features. We provide the implementation code at https://github.com/Sta8is/DINO-Foresight .

View paper on

Share this with someone who'll enjoy it:

Title:DINO-Foresight Looking into the Future with DINO

Paper and Code