Abstract:In the last few years, deep neural networks have been extensively applied in the medical domain for different tasks, ranging from image classification and segmentation to landmark detection. However, the application of these technologies in the medical domain is often hindered by data scarcity, both in terms of available annotations and images. This study introduces a new self-supervised pre-training protocol based on diffusion models for landmark detection in x-ray images. Our results show that the proposed self-supervised framework can provide accurate landmark detection with a minimal number of available annotated training images (up to 50), outperforming ImageNet supervised pre-training and state-of-the-art self-supervised pre-trainings for three popular x-ray benchmark datasets. To our knowledge, this is the first exploration of diffusion models for self-supervised learning in landmark detection, which may offer a valuable pre-training approach in few-shot regimes, for mitigating data scarcity.
Abstract:In recent years, deep learning has emerged as a promising technique for medical image analysis. However, this application domain is likely to suffer from a limited availability of large public datasets and annotations. A common solution to these challenges in deep learning is the usage of a transfer learning framework, typically with a fine-tuning protocol, where a large-scale source dataset is used to pre-train a model, further fine-tuned on the target dataset. In this paper, we present a systematic study analyzing whether the usage of small-scale in-domain x-ray image datasets may provide any improvement for landmark detection over models pre-trained on large natural image datasets only. We focus on the multi-landmark localization task for three datasets, including chest, head, and hand x-ray images. Our results show that using in-domain source datasets brings marginal or no benefit with respect to an ImageNet out-of-domain pre-training. Our findings can provide an indication for the development of robust landmark detection systems in medical images when no large annotated dataset is available.