Abstract:3D medical image segmentation has progressed considerably due to Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), yet these methods struggle to balance long-range dependency acquisition with computational efficiency. To address this challenge, we propose UNETVL (U-Net Vision-LSTM), a novel architecture that leverages recent advancements in temporal information processing. UNETVL incorporates Vision-LSTM (ViL) for improved scalability and memory functions, alongside an efficient Chebyshev Kolmogorov-Arnold Networks (KAN) to handle complex and long-range dependency patterns more effectively. We validated our method on the ACDC and AMOS2022 (post challenge Task 2) benchmark datasets, showing a significant improvement in mean Dice score compared to recent state-of-the-art approaches, especially over its predecessor, UNETR, with increases of 7.3% on ACDC and 15.6% on AMOS, respectively. Extensive ablation studies were conducted to demonstrate the impact of each component in UNETVL, providing a comprehensive understanding of its architecture. Our code is available at https://github.com/tgrex6/UNETVL, facilitating further research and applications in this domain.
Abstract:Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI) is widely used to complement ultrasound examinations and x-ray mammography during the early detection and diagnosis of breast cancer. However, images generated by various MRI scanners (e.g. GE Healthcare vs Siemens) differ both in intensity and noise distribution, preventing algorithms trained on MRIs from one scanner to generalize to data from other scanners successfully. We propose a method for image normalization to solve this problem. MRI normalization is challenging because it requires both normalizing intensity values and mapping between the noise distributions of different scanners. We utilize a cycle-consistent generative adversarial network to learn a bidirectional mapping between MRIs produced by GE Healthcare and Siemens scanners. This allows us learning the mapping between two different scanner types without matched data, which is not commonly available. To ensure the preservation of breast shape and structures within the breast, we propose two technical innovations. First, we incorporate a mutual information loss with the CycleGAN architecture to ensure that the structure of the breast is maintained. Second, we propose a modified discriminator architecture which utilizes a smaller field-of-view to ensure the preservation of finer details in the breast tissue. Quantitative and qualitative evaluations show that the second proposed method was able to consistently preserve a high level of detail in the breast structure while also performing the proper intensity normalization and noise mapping. Our results demonstrate that the proposed model can successfully learn a bidirectional mapping between MRIs produced by different vendors, potentially enabling improved accuracy of downstream computational algorithms for diagnosis and detection of breast cancer.