Abstract:Spatial transcriptomics (ST) is an emerging technology that enables medical computer vision scientists to automatically interpret the molecular profiles underlying morphological features. Currently, however, most deep learning-based ST analyses are limited to two-dimensional (2D) sections, which can introduce diagnostic errors due to the heterogeneity of pathological tissues across 3D sections. Expanding ST to three-dimensional (3D) volumes is challenging due to the prohibitive costs; a 2D ST acquisition already costs over 50 times more than whole slide imaging (WSI), and a full 3D volume with 10 sections can be an order of magnitude more expensive. To reduce costs, scientists have attempted to predict ST data directly from WSI without performing actual ST acquisition. However, these methods typically yield unsatisfying results. To address this, we introduce a novel problem setting: 3D ST imputation using 3D WSI histology sections combined with a single 2D ST slide. To do so, we present the Anatomy-aware Spatial Imputation Graph Network (ASIGN) for more precise, yet affordable, 3D ST modeling. The ASIGN architecture extends existing 2D spatial relationships into 3D by leveraging cross-layer overlap and similarity-based expansion. Moreover, a multi-level spatial attention graph network integrates features comprehensively across different data sources. We evaluated ASIGN on three public spatial transcriptomics datasets, with experimental results demonstrating that ASIGN achieves state-of-the-art performance on both 2D and 3D scenarios. Code is available at https://github.com/hrlblab/ASIGN.
Abstract:Artificial intelligence (AI) has demonstrated significant success in automating the detection of glomeruli, the key functional units of the kidney, from whole slide images (WSIs) in kidney pathology. However, existing open-source tools are often distributed as source code or Docker containers, requiring advanced programming skills that hinder accessibility for non-programmers, such as clinicians. Additionally, current models are typically trained on a single dataset and lack flexibility in adjusting confidence levels for predictions. To overcome these challenges, we introduce GloFinder, a QuPath plugin designed for single-click automated glomeruli detection across entire WSIs with online editing through the graphical user interface (GUI). GloFinder employs CircleNet, an anchor-free detection framework utilizing circle representations for precise object localization, with models trained on approximately 160,000 manually annotated glomeruli. To further enhance accuracy, the plugin incorporates Weighted Circle Fusion (WCF), an ensemble method that combines confidence scores from multiple CircleNet models to produce refined predictions, achieving superior performance in glomerular detection. GloFinder enables direct visualization and editing of results in QuPath, facilitating seamless interaction for clinicians and providing a powerful tool for nephropathology research and clinical practice.
Abstract:Segmenting glomerular intraglomerular tissue and lesions traditionally depends on detailed morphological evaluations by expert nephropathologists, a labor-intensive process susceptible to interobserver variability. Our group previously developed the Glo-In-One toolkit for integrated detection and segmentation of glomeruli. In this study, we leverage the Glo-In-One toolkit to version 2 with fine-grained segmentation capabilities, curating 14 distinct labels for tissue regions, cells, and lesions across a dataset of 23,529 annotated glomeruli across human and mouse histopathology data. To our knowledge, this dataset is among the largest of its kind to date.In this study, we present a single dynamic head deep learning architecture designed to segment 14 classes within partially labeled images of human and mouse pathology data. Our model was trained using a training set derived from 368 annotated kidney whole-slide images (WSIs) to identify 5 key intraglomerular tissues covering Bowman's capsule, glomerular tuft, mesangium, mesangial cells, and podocytes. Additionally, the network segments 9 glomerular lesion classes including adhesion, capsular drop, global sclerosis, hyalinosis, mesangial lysis, microaneurysm, nodular sclerosis, mesangial expansion, and segmental sclerosis. The glomerulus segmentation model achieved a decent performance compared with baselines, and achieved a 76.5 % average Dice Similarity Coefficient (DSC). Additional, transfer learning from rodent to human for glomerular lesion segmentation model has enhanced the average segmentation accuracy across different types of lesions by more than 3 %, as measured by Dice scores. The Glo-In-One-v2 model and trained weight have been made publicly available at https: //github.com/hrlblab/Glo-In-One_v2.
Abstract:Chronic rhinosinusitis (CRS) is characterized by persistent inflammation in the paranasal sinuses, leading to typical symptoms of nasal congestion, facial pressure, olfactory dysfunction, and discolored nasal drainage, which can significantly impact quality-of-life. Eosinophils (Eos), a crucial component in the mucosal immune response, have been linked to disease severity in CRS. The diagnosis of eosinophilic CRS typically uses a threshold of 10-20 eos per high-power field (HPF). However, manually counting Eos in histological samples is laborious and time-intensive, making the use of AI-driven methods for automated evaluations highly desirable. Interestingly, eosinophils are predominantly located in the gastrointestinal (GI) tract, which has prompted the release of numerous deep learning models trained on GI data. This study leverages a CircleSnake model initially trained on upper-GI data to segment Eos cells in whole slide images (WSIs) of nasal tissues. It aims to determine the extent to which Eos segmentation models developed for the GI tract can be adapted to nasal applications without retraining. The experimental results show promising accuracy in some WSIs, although, unsurprisingly, the performance varies across cases. This paper details these performance outcomes, delves into the reasons for such variations, and aims to provide insights that could guide future development of deep learning models for eosinophilic CRS.
Abstract:Training AI foundation models has emerged as a promising large-scale learning approach for addressing real-world healthcare challenges, including digital pathology. While many of these models have been developed for tasks like disease diagnosis and tissue quantification using extensive and diverse training datasets, their readiness for deployment on some arguably simplest tasks, such as nuclei segmentation within a single organ (e.g., the kidney), remains uncertain. This paper seeks to answer this key question, "How good are we?", by thoroughly evaluating the performance of recent cell foundation models on a curated multi-center, multi-disease, and multi-species external testing dataset. Additionally, we tackle a more challenging question, "How can we improve?", by developing and assessing human-in-the-loop data enrichment strategies aimed at enhancing model performance while minimizing the reliance on pixel-level human annotation. To address the first question, we curated a multicenter, multidisease, and multispecies dataset consisting of 2,542 kidney whole slide images (WSIs). Three state-of-the-art (SOTA) cell foundation models-Cellpose, StarDist, and CellViT-were selected for evaluation. To tackle the second question, we explored data enrichment algorithms by distilling predictions from the different foundation models with a human-in-the-loop framework, aiming to further enhance foundation model performance with minimal human efforts. Our experimental results showed that all three foundation models improved over their baselines with model fine-tuning with enriched data. Interestingly, the baseline model with the highest F1 score does not yield the best segmentation outcomes after fine-tuning. This study establishes a benchmark for the development and deployment of cell vision foundation models tailored for real-world data applications.
Abstract:Estimated brain age from magnetic resonance image (MRI) and its deviation from chronological age can provide early insights into potential neurodegenerative diseases, supporting early detection and implementation of prevention strategies. Diffusion MRI (dMRI), a widely used modality for brain age estimation, presents an opportunity to build an earlier biomarker for neurodegenerative disease prediction because it captures subtle microstructural changes that precede more perceptible macrostructural changes. However, the coexistence of macro- and micro-structural information in dMRI raises the question of whether current dMRI-based brain age estimation models are leveraging the intended microstructural information or if they inadvertently rely on the macrostructural information. To develop a microstructure-specific brain age, we propose a method for brain age identification from dMRI that minimizes the model's use of macrostructural information by non-rigidly registering all images to a standard template. Imaging data from 13,398 participants across 12 datasets were used for the training and evaluation. We compare our brain age models, trained with and without macrostructural information minimized, with an architecturally similar T1-weighted (T1w) MRI-based brain age model and two state-of-the-art T1w MRI-based brain age models that primarily use macrostructural information. We observe difference between our dMRI-based brain age and T1w MRI-based brain age across stages of neurodegeneration, with dMRI-based brain age being older than T1w MRI-based brain age in participants transitioning from cognitively normal (CN) to mild cognitive impairment (MCI), but younger in participants already diagnosed with Alzheimer's disease (AD). Approximately 4 years before MCI diagnosis, dMRI-based brain age yields better performance than T1w MRI-based brain ages in predicting transition from CN to MCI.
Abstract:Endoscopy is a crucial tool for diagnosing the gastrointestinal tract, but its effectiveness is often limited by a narrow field of view and the dynamic nature of the internal environment, especially in the esophagus, where complex and repetitive patterns make image stitching challenging. This paper introduces a novel automatic image unfolding and stitching framework tailored for esophageal videos captured during endoscopy. The method combines feature matching algorithms, including LoFTR, SIFT, and ORB, to create a feature filtering pool and employs a Density-Weighted Homography Optimization (DWHO) algorithm to enhance stitching accuracy. By merging consecutive frames, the framework generates a detailed panoramic view of the esophagus, enabling thorough and accurate visual analysis. Experimental results show the framework achieves low Root Mean Square Error (RMSE) and high Structural Similarity Index (SSIM) across extensive video sequences, demonstrating its potential for clinical use and improving the quality and continuity of endoscopic visual data.
Abstract:An incomplete field-of-view (FOV) in diffusion magnetic resonance imaging (dMRI) can severely hinder the volumetric and bundle analyses of whole-brain white matter connectivity. Although existing works have investigated imputing the missing regions using deep generative models, it remains unclear how to specifically utilize additional information from paired multi-modality data and whether this can enhance the imputation quality and be useful for downstream tractography. To fill this gap, we propose a novel framework for imputing dMRI scans in the incomplete part of the FOV by integrating the learned diffusion features in the acquired part of the FOV to the complete brain anatomical structure. We hypothesize that by this design the proposed framework can enhance the imputation performance of the dMRI scans and therefore be useful for repairing whole-brain tractography in corrupted dMRI scans with incomplete FOV. We tested our framework on two cohorts from different sites with a total of 96 subjects and compared it with a baseline imputation method that treats the information from T1w and dMRI scans equally. The proposed framework achieved significant improvements in imputation performance, as demonstrated by angular correlation coefficient (p < 1E-5), and in downstream tractography accuracy, as demonstrated by Dice score (p < 0.01). Results suggest that the proposed framework improved imputation performance in dMRI scans by specifically utilizing additional information from paired multi-modality data, compared with the baseline method. The imputation achieved by the proposed framework enhances whole brain tractography, and therefore reduces the uncertainty when analyzing bundles associated with neurodegenerative.
Abstract:Multimodal fusion promises better pancreas segmentation. However, where to perform fusion in models is still an open question. It is unclear if there is a best location to fuse information when analyzing pairs of imperfectly aligned images. Two main alignment challenges in this pancreas segmentation study are 1) the pancreas is deformable and 2) breathing deforms the abdomen. Even after image registration, relevant deformations are often not corrected. We examine how early through late fusion impacts pancreas segmentation. We used 353 pairs of T2-weighted (T2w) and T1-weighted (T1w) abdominal MR images from 163 subjects with accompanying pancreas labels. We used image registration (deeds) to align the image pairs. We trained a collection of basic UNets with different fusion points, spanning from early to late, to assess how early through late fusion influenced segmentation performance on imperfectly aligned images. We assessed generalization of fusion points on nnUNet. The single-modality T2w baseline using a basic UNet model had a Dice score of 0.73, while the same baseline on the nnUNet model achieved 0.80. For the basic UNet, the best fusion approach occurred in the middle of the encoder (early/mid fusion), which led to a statistically significant improvement of 0.0125 on Dice score compared to the baseline. For the nnUNet, the best fusion approach was na\"ive image concatenation before the model (early fusion), which resulted in a statistically significant Dice score increase of 0.0021 compared to baseline. Fusion in specific blocks can improve performance, but the best blocks for fusion are model specific, and the gains are small. In imperfectly registered datasets, fusion is a nuanced problem, with the art of design remaining vital for uncovering potential insights. Future innovation is needed to better address fusion in cases of imperfect alignment of abdominal image pairs.
Abstract:Ultrahigh field (UHF) Magnetic Resonance Imaging (MRI) provides a higher signal-to-noise ratio and, thereby, higher spatial resolution. However, UHF MRI introduces challenges such as transmit radiofrequency (RF) field (B1+) inhomogeneities, leading to uneven flip angles and image intensity anomalies. These issues can significantly degrade imaging quality and its medical applications. This study addresses B1+ field homogeneity through a novel deep learning-based strategy. Traditional methods like Magnitude Least Squares (MLS) optimization have been effective but are time-consuming and dependent on the patient's presence. Recent machine learning approaches, such as RF Shim Prediction by Iteratively Projected Ridge Regression and deep learning frameworks, have shown promise but face limitations like extensive training times and oversimplified architectures. We propose a two-step deep learning strategy. First, we obtain the desired reference RF shimming weights from multi-channel B1+ fields using random-initialized Adaptive Moment Estimation. Then, we employ Residual Networks (ResNets) to train a model that maps B1+ fields to target RF shimming outputs. Our approach does not rely on pre-calculated reference optimizations for the testing process and efficiently learns residual functions. Comparative studies with traditional MLS optimization demonstrate our method's advantages in terms of speed and accuracy. The proposed strategy achieves a faster and more efficient RF shimming design, significantly improving imaging quality at UHF. This advancement holds potential for broader applications in medical imaging and diagnostics.