
Title: PM-VIS: High-Performance Box-Supervised Video Instance Segmentation

Apr 22, 2024

Zhangjing Yang, Dun Liu, Wensheng Cheng, Jinqiao Wang, Yi Wu



Abstract: Labeling pixel-wise object masks in videos is a resource-intensive and laborious process. Box-supervised Video Instance Segmentation (VIS) methods have emerged as a viable way to reduce this annotation burden. In practical applications, the two-step approach is not only more flexible but also exhibits higher recognition accuracy. Inspired by the recent success of the Segment Anything Model (SAM), we introduce a novel approach that harnesses instance box annotations from multiple perspectives to generate high-quality instance pseudo masks, thus enriching the information contained in instance annotations. We leverage ground-truth boxes to create three types of pseudo masks using the HQ-SAM model, the box-supervised VIS model (IDOL-BoxInst), and the VOS model (DeAOT) separately, along with three corresponding optimization mechanisms. Additionally, we introduce two ground-truth data filtering methods, assisted by high-quality pseudo masks, to further enhance the training dataset quality and improve the performance of fully supervised VIS methods. To fully capitalize on the obtained high-quality pseudo masks, we introduce a novel algorithm, PM-VIS, which integrates mask losses into IDOL-BoxInst. Our PM-VIS model, trained with high-quality pseudo mask annotations, demonstrates strong ability in instance mask prediction, achieving state-of-the-art performance on the YouTube-VIS 2019, YouTube-VIS 2021, and OVIS validation sets, notably narrowing the gap between box-supervised and fully supervised VIS methods.
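The abstract does not spell out how the three kinds of pseudo masks are optimized or compared. As a rough illustration only, the sketch below shows one simple way a ground-truth box could be used to score candidate pseudo masks: take the tight bounding box of each mask and measure its IoU against the annotated box. The function names (`mask_to_box`, `box_iou`, `select_pseudo_mask`) and the selection rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def mask_to_box(mask):
    """Return the tight (x1, y1, x2, y2) box around a binary mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    # x2 and y2 are exclusive, so a single pixel yields a 1x1 box.
    return (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)

def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_pseudo_mask(candidates, gt_box):
    """Pick the candidate whose tight box best matches the ground-truth box.

    candidates: dict mapping a source name to a binary mask (2-D array).
    This selection rule is an assumption made for illustration,
    not the optimization mechanism described in the paper.
    """
    best_name, best_score = None, -1.0
    for name, mask in candidates.items():
        box = mask_to_box(mask)
        score = box_iou(box, gt_box) if box is not None else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

A candidate mask that spills outside the annotated box, or covers only part of it, receives a lower score under this rule, which is one plausible reason to consult the ground-truth box when comparing masks from different sources.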
