Abstract: Deep learning-based segmentation of the liver and of hepatic lesions is steadily gaining relevance in clinical practice due to the increasing incidence of liver cancer each year. Although various network variants with promising overall results in medical image segmentation have been developed in recent years, almost all of them struggle to segment hepatic lesions accurately. This led to the idea of combining elements of convolutional and transformer-based architectures to overcome the existing limitations. This work presents a hybrid network called SWTR-Unet, consisting of a pretrained ResNet, transformer blocks, and a common Unet-style decoder path. The network was applied to clinical liver MRI as well as to the publicly available CT data of the liver tumor segmentation (LiTS) challenge. Additionally, multiple state-of-the-art networks were implemented and applied to both datasets, ensuring direct comparability. Furthermore, a correlation analysis and an ablation study were carried out to investigate various factors influencing the segmentation accuracy of the presented method. With average Dice similarity scores of 98 ± 2 % for liver and 81 ± 28 % for lesion segmentation on the MRI dataset, and 97 ± 2 % and 79 ± 25 %, respectively, on the CT dataset, the proposed SWTR-Unet outperforms each of the additionally implemented state-of-the-art networks. The achieved segmentation accuracy was found to be on par with manually performed expert segmentations, as indicated by interobserver variabilities for liver lesion segmentation. In conclusion, the presented method could save valuable time and resources in clinical practice.
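For readers unfamiliar with the metric, the Dice similarity score reported above is twice the overlap between predicted and reference mask, divided by the sum of their sizes. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice similarity between two binary masks: 2*|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)


# Toy 2x3 masks (hypothetical data): 3 foreground pixels each, 2 shared.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

score = dice_score(pred, target)
print(f"Dice: {score:.3f}")
```

Computing this score per case and then taking the mean and standard deviation over all cases yields figures of the "mean ± standard deviation" form used in the abstract. The large spread for lesions is plausible for this metric, since small structures lose a large share of their overlap to a few misclassified pixels.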