Abstract: This study introduces Polyp-DDPM, a diffusion-based method for generating realistic polyp images conditioned on masks, aimed at improving the segmentation of gastrointestinal (GI) tract polyps. Our approach addresses the challenges of data limitations, high annotation costs, and privacy concerns associated with medical images. By conditioning the diffusion model on segmentation masks (binary masks that represent abnormal areas), Polyp-DDPM outperforms state-of-the-art methods in terms of image quality (achieving a Fréchet Inception Distance (FID) score of 78.47, compared to scores above 83.79) and segmentation performance (achieving an Intersection over Union (IoU) of 0.7156, versus less than 0.6694 for synthetic images from baseline models and 0.7067 for real data). Our method generates a high-quality, diverse synthetic dataset for training, yielding polyp segmentation models whose performance is comparable to that of models trained on real images, and offers greater data augmentation capabilities for improving segmentation models. The source code and pretrained weights for Polyp-DDPM are publicly available at https://github.com/mobaidoctor/polyp-ddpm.
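For reference, the IoU figure quoted in this abstract is the standard overlap ratio between a predicted mask and a ground-truth mask. The following is a minimal sketch, assuming binary masks stored as nested lists of 0/1; the `iou` helper and the toy masks are illustrative and are not taken from the Polyp-DDPM code base.

```python
def iou(pred, target):
    """Intersection over Union for two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


# Toy 2x3 masks (hypothetical values, for illustration only).
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = iou(pred, target)  # 2 shared pixels / 4 pixels in the union = 0.5
```

A dataset-level IoU is commonly reported as the mean of this per-image score; whether the paper averages per image or pools pixels is not stated in the abstract.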
Abstract: This paper introduces Med-DDPM, a diffusion-model-based solution for semantic 3D medical image synthesis that addresses prevalent issues in medical imaging such as data scarcity, inconsistent acquisition methods, and privacy concerns. Experimental evidence shows that diffusion models surpass Generative Adversarial Networks (GANs) in stability and performance, generating high-quality, realistic 3D medical images. The distinguishing feature of Med-DDPM is its use of semantic conditioning for the diffusion model in 3D image synthesis. By controlling the generation process through pixel-level mask labels, it facilitates the creation of realistic medical images. Empirical evaluations underscore the superior performance of Med-DDPM over GAN techniques in metrics such as accuracy, stability, and versatility. Furthermore, Med-DDPM outperforms traditional augmentation techniques and synthetic GAN images in enhancing the accuracy of segmentation models. It addresses challenges such as insufficient datasets, lack of annotated data, and class imbalance. Noting the limitations of the Fréchet Inception Distance (FID) metric, we introduce a histogram-equalized FID metric for effective performance evaluation. In summary, Med-DDPM signifies a crucial step forward in high-resolution semantic 3D medical image synthesis, overcoming the limitations of GANs and data constraints. The method offers a promising solution in medical imaging, primarily for data augmentation and anonymization.
Abstract: Semantic segmentation of medical images with deep learning models has developed rapidly. In this study, we benchmarked state-of-the-art deep learning segmentation algorithms on our clinical stereotactic radiosurgery dataset, demonstrating the strengths and weaknesses of these algorithms in a fairly practical scenario. In particular, we compared model performance with respect to sampling method, model architecture, and choice of loss function, identifying suitable settings for their application and shedding light on possible improvements.
Abstract: Stereotactic radiosurgery (SRS), which delivers high doses of irradiation in a single or few shots to small targets, has been a standard of care for brain metastases. While very effective, SRS currently requires labor-intensive manual delineation of tumors. In this work, we present a deep learning approach for automated detection and segmentation of brain metastases using multimodal imaging and ensemble neural networks. To handle small and multiple brain metastases, we further propose a volume-aware Dice loss, which optimizes model performance using information about lesion size. This work surpasses current benchmark levels and demonstrates a reliable AI-assisted system for SRS treatment planning for multiple brain metastases.
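The abstract does not give the formula for the volume-aware Dice loss, so the following is only a sketch of one plausible reading, not the authors' definition. It assumes a per-lesion soft Dice score weighted by the inverse of the lesion volume, so that small metastases contribute more to the loss. The function names, the inverse-volume weighting, and the per-lesion input format are all assumptions made for illustration.

```python
def soft_dice(pred, target, eps=1e-6):
    """Soft Dice coefficient between two flat lists of values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)


def size_weighted_dice_loss(lesions, eps=1e-6):
    """Hypothetical size-weighted Dice loss.

    lesions: list of (pred, target) pairs of flat lists, one pair per lesion.
    Each lesion is weighted by 1 / volume (an assumption, not from the paper),
    so a missed small lesion is penalized more than under a plain average.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pred, target in lesions:
        volume = sum(target)
        if volume == 0:
            continue  # skip empty targets; they carry no lesion information
        weight = 1.0 / volume
        weighted_sum += weight * soft_dice(pred, target, eps)
        weight_total += weight
    if weight_total == 0:
        return 0.0
    return 1.0 - weighted_sum / weight_total


# Toy case: a 1-voxel lesion is missed, a 4-voxel lesion is segmented perfectly.
small_missed = ([0.0], [1.0])
large_found = ([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
loss = size_weighted_dice_loss([small_missed, large_found])
```

In this toy case an unweighted mean of the two per-lesion Dice scores would give a loss of about 0.5, while the inverse-volume weighting gives about 0.8, which illustrates how lesion size can be used to emphasize small targets.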
Abstract: Brain CT has become a standard imaging tool for emergent evaluation of brain conditions, and measurement of midline shift (MLS) is one of the most important features to address in brain CT assessment. We present a simple method to estimate MLS and propose a new alternative parameter to MLS: the ratio of MLS to the maximal width of the intracranial region (MLS/ICWMAX). Three neurosurgeons and our automated system were asked to measure MLS and MLS/ICWMAX in the same sets of axial CT images obtained from 41 patients admitted to the ICU under the neurosurgical service. A weighted midline (WML) was plotted based on individual pixel intensities, with higher weights given to the darker portions. The MLS could then be measured as the distance between the WML and the ideal midline (IML) near the foramen of Monro. The average processing time to output an automatic MLS measurement was around 10 seconds. Our automated system achieved an overall accuracy of 90.24% when the CT images were calibrated automatically, and performed better when the calibration of head rotation was done manually (accuracy: 92.68%). MLS/ICWMAX and MLS yielded the same confusion matrices and similar ROC curves. We demonstrated a simple, fast, and accurate automated system for MLS measurement and introduced a new parameter (MLS/ICWMAX) as a good alternative to MLS for estimating the degree of brain deformation, especially when non-DICOM images (e.g. JPEG) are more easily accessed.
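The appeal of MLS/ICWMAX for non-DICOM images can be illustrated with a short sketch: because it is a ratio of two lengths, it stays the same whether the measurements are in pixels or millimetres, so no pixel-spacing metadata is needed. The function name, the coordinates, and the 0.5 mm pixel spacing below are hypothetical values chosen for illustration. The final lines check that the reported accuracies are consistent with 37 and 38 correct cases out of 41, assuming accuracy was counted per patient, which the abstract does not state explicitly.

```python
def mls_and_ratio(wml_x, iml_x, icw_max):
    """Return (MLS, MLS/ICWMAX) from midline positions in one consistent unit.

    wml_x:   horizontal position of the weighted midline (WML)
    iml_x:   horizontal position of the ideal midline (IML)
    icw_max: maximal width of the intracranial region
    """
    mls = abs(wml_x - iml_x)
    return mls, mls / icw_max


# Hypothetical measurements in pixels, e.g. from a JPEG export.
mls_px, ratio_px = mls_and_ratio(wml_x=262.0, iml_x=250.0, icw_max=400.0)

# The same anatomy in millimetres, assuming a 0.5 mm pixel spacing.
spacing_mm = 0.5
mls_mm, ratio_mm = mls_and_ratio(262.0 * spacing_mm, 250.0 * spacing_mm,
                                 400.0 * spacing_mm)

# MLS depends on the unit (12 px vs 6 mm); the ratio does not (0.03 in both).

# Consistency check of the reported accuracies against 41 patients.
acc_auto = round(100 * 37 / 41, 2)    # matches the reported 90.24%
acc_manual = round(100 * 38 / 41, 2)  # matches the reported 92.68%
```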