Title: BootsTAP: Bootstrapped Training for Tracking-Any-Point

Feb 01, 2024

Carl Doersch, Yi Yang, Dilara Gokay, Pauline Luc, Skanda Koppula, Ankush Gupta, Joseph Heyward, Ross Goroshin, João Carreira, Andrew Zisserman

Abstract: To endow models with a greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes. This can be formalized as Tracking-Any-Point (TAP), which requires an algorithm to track any point corresponding to a solid surface in a video, potentially densely in space and time. Large-scale ground-truth training data for TAP is available only in simulation, which currently offers a limited variety of objects and motion. In this work, we demonstrate how large-scale, unlabeled, uncurated real-world data can improve a TAP model with minimal architectural changes, using a self-supervised student-teacher setup. We demonstrate state-of-the-art performance on the TAP-Vid benchmark, surpassing previous results by a wide margin: for example, TAP-Vid-DAVIS performance improves from 61.3% to 66.4%, and TAP-Vid-Kinetics from 57.2% to 61.5%.
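
The abstract does not describe the mechanics of the student-teacher setup. As a generic illustration only, and not the paper's implementation, one common form of self-supervised student-teacher training keeps the teacher as an exponential moving average (EMA) of the student and trains the student to match the teacher's predicted point tracks on unlabeled video. The function names, the decay value, and the squared-error loss below are all assumptions made for this sketch.

```python
# Hypothetical sketch of a generic EMA student-teacher step.
# Nothing here is taken from the BootsTAP code; names and values are illustrative.

def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved a small step toward the student weights."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def pseudo_label_loss(student_tracks, teacher_tracks):
    """Mean squared distance between student and teacher (x, y) predictions.

    The teacher's tracks act as pseudo-labels, so no ground truth is needed.
    """
    total = 0.0
    for (sx, sy), (tx, ty) in zip(student_tracks, teacher_tracks):
        total += (sx - tx) ** 2 + (sy - ty) ** 2
    return total / len(student_tracks)


if __name__ == "__main__":
    # Toy weights and toy predicted point positions for two query points.
    teacher_w = ema_update([1.0, 2.0], [3.0, 4.0], decay=0.5)
    loss = pseudo_label_loss([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (2.0, 1.0)])
    print(teacher_w, loss)
```

In a real pipeline the student would be updated by gradient descent on this loss, and the teacher would then be refreshed with `ema_update` after each step; both steps are omitted here to keep the sketch dependency-free.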
