Fei Sheng

Monocular Depth Distribution Alignment with Low Computation

Mar 09, 2022

Fei Sheng, Feng Xue, Yicong Chang, Wenteng Liang, Anlong Ming

Abstract: The performance of monocular depth estimation generally depends on the number of parameters and the computational cost. This leads to a large accuracy gap between light-weight and heavy-weight networks, which limits the application of light-weight networks in the real world. In this paper, we model the majority of this accuracy gap as a difference in depth distribution, which we call "distribution drift". To address it, a distribution alignment network (DANet) is proposed. We first design a pyramid scene transformer (PST) module to capture inter-region interaction at multiple scales. By perceiving the difference in depth features between every two regions, DANet tends to predict a reasonable scene structure, which fits the shape of the predicted distribution to that of the ground truth. Then, we propose a local-global optimization (LGO) scheme to supervise the global range of scene depth. Thanks to the alignment of both the depth distribution shape and the scene depth range, DANet sharply alleviates the distribution drift and achieves performance comparable to prior heavy-weight methods while using only 1% of their floating-point operations (FLOPs). Experiments on two datasets, namely the widely used NYUDv2 dataset and the more challenging iBims-1 dataset, demonstrate the effectiveness of our method. The source code is available at https://github.com/YiLiM1/DANet.

* Accepted by ICRA 2022
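
The "distribution drift" described in the abstract above can be illustrated with a small sketch that compares the depth histogram of a prediction with that of the ground truth. This is not the authors' code: the function names, the bin count, the 10 m depth cap, and the use of total variation distance as the drift measure are all assumptions chosen for illustration.

```python
import numpy as np

def depth_histogram(depth, num_bins=50, max_depth=10.0):
    """Normalized histogram of valid depth values (0 < d <= max_depth).

    num_bins and max_depth are illustrative defaults, not values from the paper.
    """
    valid = depth[(depth > 0) & (depth <= max_depth)]
    counts, _ = np.histogram(valid, bins=num_bins, range=(0.0, max_depth))
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

def distribution_drift(pred, gt, num_bins=50, max_depth=10.0):
    """Hypothetical drift measure: total variation distance between the
    predicted and ground-truth depth histograms, a value in [0, 1]."""
    p = depth_histogram(pred, num_bins, max_depth)
    q = depth_histogram(gt, num_bins, max_depth)
    return 0.5 * np.abs(p - q).sum()

def depth_range(depth):
    """Global (min, max) of the valid depths, i.e. the scene depth range."""
    valid = depth[depth > 0]
    return float(valid.min()), float(valid.max())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(0.5, 8.0, size=(48, 64))  # synthetic ground-truth depth map
    pred = 0.7 * gt                            # prediction with a compressed depth range
    print("drift:", distribution_drift(pred, gt))
    print("ranges:", depth_range(pred), depth_range(gt))
```

In this toy setup a prediction whose depth range is compressed relative to the ground truth yields a non-zero drift, whereas a perfect prediction yields zero.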

Fast Road Segmentation via Uncertainty-aware Symmetric Network

Mar 09, 2022

Yicong Chang, Feng Xue, Fei Sheng, Wenteng Liang, Anlong Ming

Abstract: The high performance of RGB-D based road segmentation methods contrasts with their rare application in commercial autonomous driving, for two reasons: 1) prior methods cannot achieve both high inference speed and high accuracy; 2) the different properties of RGB and depth data are not well exploited, limiting the reliability of the predicted road. In this paper, based on evidence theory, an uncertainty-aware symmetric network (USNet) is proposed to achieve a trade-off between speed and accuracy by fully fusing RGB and depth data. First, cross-modal feature fusion operations, which are indispensable in prior RGB-D based methods, are abandoned. Instead, we adopt two separate light-weight subnetworks to learn road representations from the RGB and depth inputs. The light-weight structure guarantees real-time inference. Moreover, a multi-scale evidence collection (MEC) module is designed to collect evidence at multiple scales for each modality, which provides sufficient evidence for determining the class of each pixel. Finally, in the uncertainty-aware fusion (UAF) module, the uncertainty of each modality is perceived to guide the fusion of the two subnetworks. Experimental results demonstrate that our method achieves state-of-the-art accuracy with a real-time inference speed of 43+ FPS. The source code is available at https://github.com/morancyc/USNet.

* Accepted by ICRA 2022
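
The evidence-based fusion idea in the abstract above can be sketched with a formulation that is common in evidential deep learning: per-class evidence is turned into belief masses plus an uncertainty mass, and two such opinions are merged with a reduced Dempster combination rule. Whether USNet's UAF module uses exactly this rule is an assumption; the function names and the example evidence values are hypothetical.

```python
import numpy as np

def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence of shape (..., K) to
    (belief, uncertainty) under the subjective-logic convention:
    alpha = evidence + 1, S = sum(alpha), b = evidence / S, u = K / S."""
    evidence = np.asarray(evidence, dtype=float)
    num_classes = evidence.shape[-1]
    strength = (evidence + 1.0).sum(axis=-1, keepdims=True)  # Dirichlet strength S
    belief = evidence / strength
    uncertainty = num_classes / strength
    return belief, uncertainty

def combine_opinions(b1, u1, b2, u2):
    """Reduced Dempster combination of two opinions (assumed fusion rule)."""
    # Conflict: belief mass the two sources assign to different classes,
    # sum_{i != j} b1_i * b2_j = sum(b1) * sum(b2) - sum(b1 * b2).
    conflict = (b1.sum(-1, keepdims=True) * b2.sum(-1, keepdims=True)
                - (b1 * b2).sum(-1, keepdims=True))
    scale = 1.0 / (1.0 - conflict)
    belief = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    uncertainty = scale * (u1 * u2)
    return belief, uncertainty

if __name__ == "__main__":
    # One pixel, two classes (road, non-road); evidence values are made up.
    b_rgb, u_rgb = opinion_from_evidence([9.0, 1.0])      # confident RGB branch
    b_depth, u_depth = opinion_from_evidence([0.5, 0.5])  # uncertain depth branch
    b_fused, u_fused = combine_opinions(b_rgb, u_rgb, b_depth, u_depth)
    print("fused belief:", b_fused, "fused uncertainty:", u_fused)
```

Under this rule the more certain modality dominates the fused belief, and the fused belief masses and uncertainty still sum to one.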

Boundary-induced and scene-aggregated network for monocular depth prediction

Feb 26, 2021

Feng Xue, Junfeng Cao, Yu Zhou, Fei Sheng, Yankai Wang, Anlong Ming

Abstract: Monocular depth prediction is an important task in scene understanding. It aims to predict the dense depth of a single RGB image. With the development of deep learning, performance on this task has improved greatly. However, two issues remain unresolved: (1) the deep features encode the wrong farthest region in a scene, which leads to a distorted 3D structure in the predicted depth; (2) the low-level features are insufficiently utilized, which makes it even harder to estimate depth near edges with sudden depth changes. To tackle these two issues, we propose the Boundary-induced and Scene-aggregated network (BS-Net). In this network, the Depth Correlation Encoder (DCE) is first designed to obtain the contextual correlations between the regions in an image and to perceive the farthest region by considering these correlations. Meanwhile, the Bottom-Up Boundary Fusion (BUBF) module is designed to extract accurate boundaries that indicate depth changes. Finally, the Stripe Refinement module (SRM) is designed to refine the dense depth induced by the boundary cue, which improves the boundary accuracy of the predicted depth. Experimental results on the NYUD v2 dataset and the iBims-1 dataset illustrate the state-of-the-art performance of the proposed approach. In addition, the SUN-RGBD dataset is employed to evaluate the generalization of our method. Code is available at https://github.com/XuefengBUPT/BS-Net.

* Accepted by Pattern Recognition 2021
