
Title: ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Oct 30, 2024

Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang, Pavel Tokmakov, Martial Hebert



Abstract: We present REM, a framework for segmenting a wide range of concepts in video that can be described through natural language. Our method capitalizes on visual-language representations learned by video diffusion models on Internet-scale datasets. A key insight of our approach is preserving as much of the generative model's original representation as possible, while fine-tuning it on narrow-domain Referral Object Segmentation datasets. As a result, our framework can accurately segment and track rare and unseen objects, despite being trained on object masks from a limited set of categories. Additionally, it can generalize to non-object dynamic concepts, such as waves crashing in the ocean, as demonstrated in our newly introduced benchmark for Referral Video Process Segmentation (Ref-VPS). Our experiments show that REM performs on par with state-of-the-art approaches on in-domain datasets, like Ref-DAVIS, while outperforming them by up to twelve points in terms of region similarity on out-of-domain data, leveraging the power of Internet-scale pre-training.
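
The abstract reports results in terms of region similarity. On DAVIS-style video segmentation benchmarks this is commonly the Jaccard index (intersection over union) between a predicted mask and the ground-truth mask, averaged over frames. The following is a minimal sketch of that metric under stated assumptions, not the authors' evaluation code: the function names, the nested-list mask format, and the handling of two empty masks are all illustrative choices made here.

```python
from typing import Sequence

# Assumed mask format: rows of 0/1 values, one mask per video frame.
Mask = Sequence[Sequence[int]]


def region_similarity(pred: Mask, gt: Mask) -> float:
    """Jaccard index (intersection over union) of two binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_region_similarity(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [region_similarity(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred_frames = [[[1, 1], [0, 0]], [[0, 0], [0, 0]]]
    gt_frames = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]
    print(mean_region_similarity(pred_frames, gt_frames))
```

Under this definition, a gain of "twelve points" corresponds to a 0.12 increase in the mean score when it is reported on a 0 to 100 scale.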

* Project page at https://miccooper9.github.io/projects/ReferEverything/
