Abstract: Statistical Shape Models (SSMs) excel at identifying population-level anatomical variations, which is at the core of various clinical and biomedical applications, including morphology-based diagnostics and surgical planning. However, the effectiveness of SSMs is often constrained by the need for expert-driven manual segmentation, a process that is both time-intensive and expensive, which restricts their broader application and utility. Recent deep learning approaches enable the direct estimation of SSMs from unsegmented images. While these models can predict SSMs without segmentation during deployment, they do not address the challenge of acquiring the manual annotations needed for training, particularly in resource-limited settings. Semi-supervised and foundation models for anatomy segmentation can mitigate the annotation burden. Yet, despite the abundance of available approaches, there are no established guidelines to inform end-users on their effectiveness for the downstream task of constructing SSMs. In this study, we systematically evaluate the potential of weakly supervised methods as viable alternatives to manual segmentations for building SSMs. We establish a new performance benchmark by applying various semi-supervised and foundation-model methods for anatomy segmentation under low-annotation settings and using the predicted segmentations to construct SSMs. We compare the resulting modes of shape variation, and use quantitative metrics, against a shape model derived from a manually annotated dataset. Our results indicate that some methods produce noisy segmentations, which are detrimental to SSM tasks, while others can capture the correct modes of variation in the population cohort with a 60-80% reduction in required manual annotation.
Abstract: Statistical Shape Modeling (SSM) is an effective method for quantitatively analyzing anatomical variations within populations. However, its utility is limited by the need for manual segmentations of anatomies, a task that relies on the scarce expertise of medical professionals. Recent advances in deep learning have provided a promising approach that automatically generates statistical representations from unsegmented images. Once trained, these deep learning-based models eliminate the need for manual segmentation of new subjects. Nonetheless, most current methods still require manually pre-aligning image volumes and specifying a bounding box around the target anatomy prior to inference, resulting in a partially manual inference process. Recent approaches facilitate anatomy localization, but they only estimate statistical representations at the population level, cannot delineate anatomy directly in images, and are limited to modeling a single anatomy. Here, we introduce MASSM, a novel end-to-end deep learning framework that simultaneously localizes multiple anatomies in an image, estimates population-level statistical representations, and delineates each anatomy. Our findings emphasize the crucial role of local correspondences, showing that they are indispensable for providing superior shape information for medical imaging tasks.