Abstract:The success of deep learning in intelligent ship visual perception relies heavily on rich image data. However, dedicated datasets for inland waterway vessels remain scarce, limiting the adaptability of visual perception systems in complex environments. Inland waterways, characterized by narrow channels, variable weather, and urban interference, pose significant challenges to object detection systems based on existing datasets. To address these issues, this paper introduces the Multi-environment Inland Waterway Vessel Dataset (MEIWVD), comprising 32,478 high-quality images from diverse scenarios, including sunny, rainy, foggy, and artificial lighting conditions. MEIWVD covers common vessel types in the Yangtze River Basin, emphasizing diversity, sample independence, environmental complexity, and multi-scale characteristics, making it a robust benchmark for vessel detection. Leveraging MEIWVD, this paper proposes a scene-guided image enhancement module to improve water surface images based on environmental conditions adaptively. Additionally, a parameter-limited dilated convolution enhances the representation of vessel features, while a multi-scale dilated residual fusion method integrates multi-scale features for better detection. Experiments show that MEIWVD provides a more rigorous benchmark for object detection algorithms, and the proposed methods significantly improve detector performance, especially in complex multi-environment scenarios.
Abstract:Accurate automated extraction of brain vessel centerlines from CTA images plays an important role in diagnosis and therapy of cerebrovascular diseases, such as stroke. However, this task remains challenging due to the complex cerebrovascular structure, the varying imaging quality, and vessel pathology effects. In this paper, we consider automatic lumen segmentation generation without additional annotation effort by physicians and more effective use of the generated lumen segmentation for improved centerline extraction performance. We propose an automated framework for brain vessel centerline extraction from CTA images. The framework consists of four major components: (1) pre-processing approaches that register CTA images with a CT atlas and divide these images into input patches, (2) lumen segmentation generation from annotated vessel centerlines using graph cuts and robust kernel regression, (3) a dual-branch topology-aware UNet (DTUNet) that can effectively utilize the annotated vessel centerlines and the generated lumen segmentation through a topology-aware loss (TAL) and its dual-branch design, and (4) post-processing approaches that skeletonize the predicted lumen segmentation. Extensive experiments on a multi-center dataset demonstrate that the proposed framework outperforms state-of-the-art methods in terms of average symmetric centerline distance (ASCD) and overlap (OV). Subgroup analyses further suggest that the proposed framework holds promise in clinical applications for stroke treatment. Code is publicly available at https://github.com/Liusj-gh/DTUNet.
Abstract:Transformer is beneficial for image denoising tasks since it can model long-range dependencies to overcome the limitations presented by inductive convolutional biases. However, directly applying the transformer structure to remove noise is challenging because its complexity grows quadratically with the spatial resolution. In this paper, we propose an efficient Dual-branch Deformable Transformer (DDT) denoising network which captures both local and global interactions in parallel. We divide features with a fixed patch size and a fixed number of patches in local and global branches, respectively. In addition, we apply deformable attention operation in both branches, which helps the network focus on more important regions and further reduces computational complexity. We conduct extensive experiments on real-world and synthetic denoising tasks, and the proposed DDT achieves state-of-the-art performance with significantly fewer computational costs.