Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Asoke Nandi

CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Jun 06, 2023

Tao Lei, Rui Sun, Xuan Wang, Yingbo Wang, Xi He, Asoke Nandi

Figure 1 for CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Figure 2 for CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Figure 3 for CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Figure 4 for CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Abstract:The hybrid architecture of convolutional neural networks (CNNs) and Transformer are very popular for medical image segmentation. However, it suffers from two challenges. First, although a CNNs branch can capture the local image features using vanilla convolution, it cannot achieve adaptive feature learning. Second, although a Transformer branch can capture the global features, it ignores the channel and cross-dimensional self-attention, resulting in a low segmentation accuracy on complex-content images. To address these challenges, we propose a novel hybrid architecture of convolutional neural networks hand in hand with vision Transformers (CiT-Net) for medical image segmentation. Our network has two advantages. First, we design a dynamic deformable convolution and apply it to the CNNs branch, which overcomes the weak feature extraction ability due to fixed-size convolution kernels and the stiff design of sharing kernel parameters among different inputs. Second, we design a shifted-window adaptive complementary attention module and a compact convolutional projection. We apply them to the Transformer branch to learn the cross-dimensional long-term dependency for medical images. Experimental results show that our CiT-Net provides better medical image segmentation results than popular SOTA methods. Besides, our CiT-Net requires lower parameters and less computational costs and does not rely on pre-training. The code is publicly available at https://github.com/SR0920/CiT-Net.

Via

Access Paper or Ask Questions