Title: DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Sep 18, 2023

Bowen Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou

Figure 1 for DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Figure 2 for DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Figure 3 for DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Figure 4 for DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Abstract: We present DFormer, a novel RGB-D pre-training framework for learning transferable representations for RGB-D segmentation tasks. DFormer has two key innovations: 1) Unlike previous works that aim to encode RGB features, DFormer comprises a sequence of RGB-D blocks, which are tailored for encoding both RGB and depth information through a novel building block design; 2) We pre-train the backbone using image-depth pairs from ImageNet-1K, so DFormer is endowed with the capacity to encode RGB-D representations. This avoids the mismatched encoding of 3D geometric relationships in depth maps by RGB pre-trained backbones, a problem that is widespread in existing methods but has not been resolved. We fine-tune the pre-trained DFormer on two popular RGB-D tasks, i.e., RGB-D semantic segmentation and RGB-D salient object detection, with a lightweight decoder head. Experimental results show that DFormer achieves new state-of-the-art performance on these two tasks, at less than half the computational cost of the current best methods, on two RGB-D segmentation datasets and five RGB-D saliency datasets. Our code is available at: https://github.com/VCIP-RGBD/DFormer.
