Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Jul 04, 2022

Chang Liu, Gang Yang, Shuo Wang, Hangxu Wang, Yunhua Zhang, Yutao Wang

Figure 1 for TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Figure 2 for TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Figure 3 for TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Figure 4 for TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Share this with someone who'll enjoy it:

Abstract:Existing RGB-D SOD methods mainly rely on a symmetric two-stream CNN-based network to extract RGB and depth channel features separately. However, there are two problems with the symmetric conventional network structure: first, the ability of CNN in learning global contexts is limited; second, the symmetric two-stream structure ignores the inherent differences between modalities. In this paper, we propose a Transformer-based asymmetric network (TANet) to tackle the issues mentioned above. We employ the powerful feature extraction capability of Transformer (PVTv2) to extract global semantic information from RGB data and design a lightweight CNN backbone (LWDepthNet) to extract spatial structure information from depth data without pre-training. The asymmetric hybrid encoder (AHE) effectively reduces the number of parameters in the model while increasing speed without sacrificing performance. Then, we design a cross-modal feature fusion module (CMFFM), which enhances and fuses RGB and depth features with each other. Finally, we add edge prediction as an auxiliary task and propose an edge enhancement module (EEM) to generate sharper contours. Extensive experiments demonstrate that our method achieves superior performance over 14 state-of-the-art RGB-D methods on six public datasets. Our code will be released at https://github.com/lc012463/TANet.

View paper on

Share this with someone who'll enjoy it:

Title:TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Paper and Code