Abstract: Accurate detection and segmentation of diffuse large B-cell lymphoma (DLBCL) from PET images have important implications for estimation of total metabolic tumor volume, radiomics analysis, surgical intervention and radiotherapy. Manual segmentation of tumors in whole-body PET images is time-consuming, labor-intensive and operator-dependent. In this work, we develop and validate a fast and efficient three-step cascaded deep learning model for automated detection and segmentation of DLBCL tumors from PET images. Compared to a single end-to-end network for segmentation of tumors in whole-body PET images, our three-step model is more effective (it improves the 3D Dice score from 58.9% to 78.1%) because each of its specialized modules, namely the slice classifier, the tumor detector and the tumor segmentor, can be trained independently to a high degree of skill on a specific task, whereas a single network must handle the overall segmentation and performs suboptimally.
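The control flow of such a three-step cascade can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the modules are wired together, so the slice-by-slice loop, the bounding-box interface, and the names `cascade_segment`, `toy_has_tumor`, `toy_detect`, `toy_segment` and `THRESH` are all hypothetical. The three toy functions are simple thresholding stand-ins for the trained networks, not the authors' models.

```python
from typing import Callable, List, Optional, Tuple

Slice = List[List[float]]        # one 2D PET slice (rows of uptake values)
Box = Tuple[int, int, int, int]  # (row_min, row_max, col_min, col_max), inclusive
Mask = List[List[int]]           # binary mask with the same shape as a slice


def cascade_segment(
    volume: List[Slice],
    has_tumor: Callable[[Slice], bool],
    detect: Callable[[Slice], Optional[Box]],
    segment: Callable[[Slice, Box], Mask],
) -> List[Mask]:
    """Run classifier -> detector -> segmentor on each slice (assumed wiring)."""
    masks: List[Mask] = []
    for sl in volume:
        empty = [[0] * len(row) for row in sl]
        # Step 1: skip slices that the classifier marks as tumor-free.
        if not has_tumor(sl):
            masks.append(empty)
            continue
        # Step 2: localize a candidate region in the remaining slices.
        box = detect(sl)
        if box is None:
            masks.append(empty)
            continue
        # Step 3: segment only inside the detected region.
        masks.append(segment(sl, box))
    return masks


THRESH = 2.5  # arbitrary uptake threshold used by the toy stand-ins


def toy_has_tumor(sl: Slice) -> bool:
    return any(v > THRESH for row in sl for v in row)


def toy_detect(sl: Slice) -> Optional[Box]:
    hits = [(r, c) for r, row in enumerate(sl)
            for c, v in enumerate(row) if v > THRESH]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), max(rows), min(cols), max(cols))


def toy_segment(sl: Slice, box: Box) -> Mask:
    r0, r1, c0, c1 = box
    return [[1 if r0 <= r <= r1 and c0 <= c <= c1 and v > THRESH else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(sl)]
```

Because each stage is passed in as a callable, any one of them can be trained, evaluated, or replaced without touching the other two, which is the property the abstract credits for the improved Dice score.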
Abstract: This study performs a comprehensive evaluation of four neural network architectures (UNet, SegResNet, DynUNet, and SwinUNETR) for lymphoma lesion segmentation from PET/CT images. These networks were trained, validated, and tested on a diverse, multi-institutional dataset of 611 cases. Internal testing (88 cases; total metabolic tumor volume (TMTV) range [0.52, 2300] ml) showed SegResNet as the top performer with a median Dice similarity coefficient (DSC) of 0.76 and median false positive volume (FPV) of 4.55 ml; all networks had a median false negative volume (FNV) of 0 ml. On the unseen external test set (145 cases; TMTV range [0.10, 2480] ml), SegResNet achieved the best median DSC of 0.68 and FPV of 21.46 ml, while UNet had the best FNV of 0.41 ml. We assessed the reproducibility of six lesion measures, calculated their prediction errors, and examined DSC performance in relation to these lesion measures, offering insights into segmentation accuracy and clinical relevance. Additionally, we introduced three lesion detection criteria, addressing the clinical need for identifying lesions, counting them, and segmenting them based on metabolic characteristics. We also performed an expert intra-observer variability analysis, revealing the challenges in segmenting "easy" vs. "hard" cases, to assist in the development of more resilient segmentation algorithms. Finally, we performed an inter-observer agreement assessment, underscoring the importance of a standardized ground truth segmentation protocol involving multiple expert annotators. Code is available at: https://github.com/microsoft/lymphoma-segmentation-dnn
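For readers unfamiliar with the three reported metrics, a minimal voxel-wise sketch is given below. It assumes binary masks flattened to sequences of 0/1 and a known per-voxel volume in ml. The function name `dsc_fpv_fnv` and the convention of returning a DSC of 1.0 when both masks are empty are assumptions made here, and the study itself may define FPV and FNV differently (for example per connected component rather than per voxel).

```python
from typing import Sequence, Tuple


def dsc_fpv_fnv(
    pred: Sequence[int], truth: Sequence[int], voxel_ml: float
) -> Tuple[float, float, float]:
    """Return (DSC, FPV in ml, FNV in ml) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # overlap
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # predicted only
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)  # missed
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    dsc = 2 * tp / denom if denom else 1.0
    return dsc, fp * voxel_ml, fn * voxel_ml


# Usage: four voxels of 0.5 ml each, one true positive, one false
# positive and one false negative.
dsc, fpv, fnv = dsc_fpv_fnv([1, 1, 0, 0], [1, 0, 1, 0], voxel_ml=0.5)
```

DSC is a ratio and so is insensitive to absolute lesion size, whereas FPV and FNV are absolute volumes; this is one reason the three are worth reporting together across a wide TMTV range.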
Abstract: The time-consuming task of manual segmentation challenges routine systematic quantification of disease burden. Convolutional neural networks (CNNs) hold significant promise to reliably identify locations and boundaries of tumors from PET scans. We aimed to reduce the need for annotated data via semi-supervised approaches, with application to PET images of diffuse large B-cell lymphoma (DLBCL) and primary mediastinal large B-cell lymphoma (PMBCL). We analyzed 18F-FDG PET images of 292 patients with PMBCL (n=104) and DLBCL (n=188) (n=232 for training and validation, and n=60 for external testing). We employed fuzzy C-means (FCM) and Mumford-Shah (MS) losses for training a 3D U-Net with different levels of supervision: i) fully supervised methods with labeled FCM (LFCM) as well as Unified focal and Dice loss functions, ii) unsupervised methods with Robust FCM (RFCM) and MS loss functions, and iii) semi-supervised methods based on FCM (RFCM+LFCM), as well as MS loss in combination with supervised Dice loss (MS+Dice). The Unified loss function yielded a higher Dice score (mean +/- standard deviation (SD)) (0.73 +/- 0.03; 95% CI, 0.67-0.8) than the Dice loss (p-value<0.01). The semi-supervised approach (RFCM+alpha*LFCM) with alpha=0.3 showed the best performance, with a Dice score of 0.69 +/- 0.03 (95% CI, 0.45-0.77), outperforming (MS+alpha*Dice) for any supervision level (any alpha) (p<0.01). The best performer among the (MS+alpha*Dice) semi-supervised approaches, with alpha=0.2, showed a Dice score of 0.60 +/- 0.08 (95% CI, 0.44-0.76) compared to other supervision levels in this semi-supervised approach (p<0.01). Semi-supervised learning via FCM loss (RFCM+alpha*LFCM) showed improved performance compared to supervised approaches. Considering the time-consuming nature of expert manual delineations and intra-observer variabilities, semi-supervised approaches have significant potential for automated segmentation workflows.
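The alpha-weighted pattern behind (RFCM+alpha*LFCM) and (MS+alpha*Dice), an unsupervised term plus alpha times a supervised term, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the classic fuzzy C-means objective stands in for the unsupervised term (without the robust regularization of RFCM), a soft Dice loss stands in for the supervised term, and the names `fcm_loss`, `soft_dice_loss` and `semi_supervised_loss` are hypothetical. This FCM+Dice pairing is not one of the combinations evaluated in the abstract.

```python
from typing import Sequence


def fcm_loss(
    intensities: Sequence[float],
    memberships: Sequence[Sequence[float]],
    q: float = 2.0,
) -> float:
    """Classic FCM objective: sum_j sum_k u_jk^q * (y_j - v_k)^2.

    memberships[j][k] is the predicted membership of voxel j in class k;
    the centroid v_k is the membership-weighted mean intensity of class k.
    No labels are needed, so this term is unsupervised.
    """
    n_classes = len(memberships[0])
    loss = 0.0
    for k in range(n_classes):
        w = [u[k] ** q for u in memberships]
        total = sum(w)
        if total == 0:
            continue  # class k received no membership mass
        v_k = sum(wi * y for wi, y in zip(w, intensities)) / total
        loss += sum(wi * (y - v_k) ** 2 for wi, y in zip(w, intensities))
    return loss


def soft_dice_loss(
    probs: Sequence[float], labels: Sequence[int], eps: float = 1e-6
) -> float:
    """1 - soft Dice between foreground probabilities and binary labels."""
    inter = sum(p * t for p, t in zip(probs, labels))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(labels) + eps)


def semi_supervised_loss(unsup: float, sup: float, alpha: float) -> float:
    """Unsupervised term plus alpha times the supervised term."""
    return unsup + alpha * sup
```

With alpha=0 the objective is fully unsupervised, and increasing alpha gives the labeled term more influence, which is the sense in which the abstract speaks of different supervision levels.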