Abstract: Image segmentation is a vital task for providing human assistance and enhancing autonomy in daily life. In particular, RGB-D segmentation, which leverages both visual and depth cues, has attracted increasing attention because it promises richer scene understanding than RGB-only methods. However, most existing efforts have focused on semantic segmentation, leaving a critical gap: instance-level RGB-D segmentation datasets remain relatively scarce, which restricts current methods to broad category distinctions rather than capturing the fine-grained details required to recognize individual objects. To bridge this gap, we introduce three RGB-D instance segmentation benchmarks in which objects are distinguished at the instance level. These datasets are versatile, supporting a wide range of applications from indoor navigation to robotic manipulation. In addition, we present an extensive evaluation of various baseline models on these benchmarks. This analysis identifies both the strengths and the shortcomings of the baselines, guiding future work toward more robust, generalizable solutions. Finally, we propose a simple yet effective method for RGB-D data integration. Extensive evaluations affirm the effectiveness of our approach, offering a robust framework for advancing toward more nuanced scene understanding.
Abstract: Because X-ray CT carries a potential risk of inducing cancer, the radiation dose should be reduced for routine patient scanning. However, in low-dose X-ray CT, severe artifacts usually occur due to photon starvation, beam hardening, and other effects, which decrease the reliability of diagnosis. High-quality reconstruction from low-dose X-ray CT data has therefore become an important research topic in the CT community. Conventional model-based denoising approaches are, however, computationally very expensive, and image-domain denoising approaches can hardly deal with CT-specific noise patterns. To address these issues, we propose an algorithm using a deep convolutional neural network (CNN) that is applied to the wavelet transform coefficients of low-dose CT images. Specifically, by using a directional wavelet transform to extract the directional components of artifacts and by exploiting the intra- and inter-band correlations, our deep network can effectively suppress CT-specific noise. Moreover, our CNN is designed with various types of residual learning architecture for faster network training and better denoising. Experimental results confirm that the proposed algorithm effectively removes the complex noise patterns in CT images that originate from the reduced X-ray dose. In addition, we show that the wavelet-domain CNN removes noise from low-dose CT more efficiently than an image-domain CNN. Our results were rigorously evaluated by several radiologists and won second place in the 2016 AAPM Low-Dose CT Grand Challenge. To the best of our knowledge, this work is the first deep learning architecture for low-dose CT reconstruction that has been rigorously evaluated and proven for its efficacy.
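
The wavelet-domain residual idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' network. It assumes a one-level Haar transform as a stand-in for the directional wavelet transform, and a hypothetical `predict_noise` function (a simple clipping rule) as a stand-in for the trained CNN. All function names and the threshold value are illustrative assumptions. Only the overall flow follows the text: transform the image, estimate the noise in the coefficients, subtract that estimate (the residual formulation), and invert the transform.

```python
def haar2d(img):
    """One-level orthonormal 2D Haar transform (stand-in for a directional wavelet).

    img is a list of rows with even height and width.
    Returns four subbands: ll (local average) and three detail bands.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(img), 2):
        rows = ([], [], [], [])
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # local average
            rows[1].append((a - b + c - d) / 2)  # difference across columns
            rows[2].append((a + b - c - d) / 2)  # difference across rows
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d; the 2x2 Haar matrix is its own inverse."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


def predict_noise(band, threshold):
    """Hypothetical placeholder for the trained CNN (an assumption, not the paper's model).

    It treats the part of each coefficient inside [-threshold, threshold] as noise.
    """
    return [[max(-threshold, min(threshold, v)) for v in row] for row in band]


def denoise(img, threshold):
    """Residual formulation: clean coefficients = noisy coefficients - predicted noise."""
    ll, *details = haar2d(img)
    cleaned = []
    for band in details:
        noise = predict_noise(band, threshold)
        cleaned.append([[v - n for v, n in zip(row, nrow)]
                        for row, nrow in zip(band, noise)])
    # The low-pass band is left untouched in this sketch.
    return ihaar2d(ll, *cleaned)


if __name__ == "__main__":
    noisy = [[1.0, 1.0], [1.0, 1.4]]
    print(denoise(noisy, threshold=0.5))
```

With the clipping placeholder, subtracting the predicted noise is equivalent to soft-thresholding the detail coefficients, so a small perturbation in a flat 2x2 block is smoothed to the block mean. A learned predictor would replace `predict_noise` while the surrounding transform, subtract, and invert steps stay the same.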