
Title: TubeFormer-DeepLab: Video Mask Transformer

May 30, 2022

Dahun Kim, Jun Xie, Huiyu Wang, Siyuan Qiao, Qihang Yu, Hong-Seok Kim, Hartwig Adam, In So Kweon, Liang-Chieh Chen



Abstract: We present TubeFormer-DeepLab, the first attempt to tackle multiple core video segmentation tasks in a unified manner. Different video segmentation tasks (e.g., video semantic/instance/panoptic segmentation) are usually considered as distinct problems. State-of-the-art models adopted in the separate communities have diverged, and radically different approaches dominate in each task. By contrast, we make a crucial observation that video segmentation tasks could be generally formulated as the problem of assigning different predicted labels to video tubes (where a tube is obtained by linking segmentation masks along the time axis) and the labels may encode different values depending on the target task. The observation motivates us to develop TubeFormer-DeepLab, a simple and effective video mask transformer model that is widely applicable to multiple video segmentation tasks. TubeFormer-DeepLab directly predicts video tubes with task-specific labels (either pure semantic categories, or both semantic categories and instance identities), which not only significantly simplifies video segmentation models, but also advances state-of-the-art results on multiple video segmentation benchmarks.
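The tube formulation in the abstract can be illustrated with a small data-structure sketch. This is not the authors' model or code: it only shows one plausible way to represent a tube as per-frame masks linked along the time axis, with a label that carries either a semantic category alone or a semantic category plus an instance identity. The names `Tube`, `render_labels`, and `VOID` are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Mask = List[List[int]]              # one frame, H x W, entries are 0 or 1
Label = Tuple[int, Optional[int]]   # (semantic_class, instance_id)

VOID: Label = (-1, None)            # hypothetical label for uncovered pixels


@dataclass
class Tube:
    """A tube: per-frame masks linked along the time axis, plus one label."""
    masks: List[Mask]                   # length T, one mask per frame
    semantic_class: int
    instance_id: Optional[int] = None   # None when only the category matters


def render_labels(tubes: List[Tube], num_frames: int,
                  height: int, width: int) -> List[List[List[Label]]]:
    """Paint each tube's label onto every pixel it covers, frame by frame."""
    out = [[[VOID for _ in range(width)] for _ in range(height)]
           for _ in range(num_frames)]
    for tube in tubes:
        label = (tube.semantic_class, tube.instance_id)
        for t, mask in enumerate(tube.masks):
            for y, row in enumerate(mask):
                for x, on in enumerate(row):
                    if on:
                        out[t][y][x] = label
    return out


# Toy video: 2 frames of 1 x 3 pixels.
road = Tube(masks=[[[1, 0, 0]], [[1, 1, 0]]], semantic_class=0)
car = Tube(masks=[[[0, 0, 1]], [[0, 0, 1]]], semantic_class=1, instance_id=7)

labels = render_labels([road, car], num_frames=2, height=1, width=3)
```

In this sketch a semantic-only task would leave `instance_id` as `None` on every tube, while a panoptic-style task would set it for countable objects. The same tube container serves both cases, which is the point the abstract makes about task-specific labels.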

* CVPR 2022
