Title: Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation

Jul 21, 2022

Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc Van Gool

Abstract: The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction. Previous efforts have mainly been devoted to developing new techniques for calculating cross-frame affinities, such as optical flow and attention. This paper instead contributes from a different angle by mining relations among cross-frame affinities, upon which better temporal information aggregation can be achieved. We explore relations among affinities in two aspects: single-scale intrinsic correlations and multi-scale relations. Inspired by traditional feature processing, we propose Single-scale Affinity Refinement (SAR) and Multi-scale Affinity Aggregation (MAA). To make MAA feasible, we propose a Selective Token Masking (STM) strategy that selects a subset of consistent reference tokens for different scales when calculating affinities, which also improves the efficiency of our method. Finally, the cross-frame affinities strengthened by SAR and MAA are used to adaptively aggregate temporal information. Our experiments demonstrate that the proposed method performs favorably against state-of-the-art VSS methods. The code is publicly available at https://github.com/GuoleiSun/VSS-MRCFA
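
To make the term "cross-frame affinity" concrete, here is a minimal sketch of an attention-style affinity between tokens of a target frame and tokens of a reference frame, followed by affinity-weighted aggregation of the reference features. It is not the paper's implementation: SAR, MAA and STM are not modeled, and the function names, the scaled dot-product form, and the toy token values are all assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_frame_affinity(target_tokens, reference_tokens):
    """Assumed scaled dot-product affinity.

    Returns one row per target token and one column per reference token;
    each row is normalized to sum to 1.
    """
    scale = math.sqrt(len(target_tokens[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in reference_tokens])
        for q in target_tokens
    ]


def aggregate(affinity, reference_tokens):
    """Affinity-weighted sum of reference features for each target token."""
    dim = len(reference_tokens[0])
    return [
        [sum(w * tok[d] for w, tok in zip(row, reference_tokens)) for d in range(dim)]
        for row in affinity
    ]


# Toy data (hypothetical): 2 target tokens and 3 reference tokens, feature size 2.
target = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

affinity = cross_frame_affinity(target, reference)
temporal_features = aggregate(affinity, reference)
```

In this sketch the affinity matrix is used as is. The abstract's contribution sits between the two calls: the affinities would first be refined and combined across scales before being used for aggregation.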

* Accepted to ECCV 2022
