Abstract: Modelling ultrasound speckle has generated considerable interest for its ability to characterize tissue properties. As speckle is dependent on the underlying tissue architecture, modelling it may aid in tasks like segmentation or disease detection. However, for the transplanted kidney, where ultrasound is commonly used to investigate dysfunction, it is currently unknown which statistical distribution best characterizes such speckle. This is especially true for the regions of the transplanted kidney: the cortex, the medulla and the central echogenic complex. Furthermore, it is unclear how these distributions vary by patient variables such as age, sex, body mass index, primary disease, or donor type. These traits may influence speckle modelling given their influence on kidney anatomy. We are the first to investigate these two aims. N=821 kidney transplant recipient B-mode images were automatically segmented into the cortex, medulla, and central echogenic complex using a neural network. Seven distinct probability distributions were fitted to each region. The Rayleigh and Nakagami distributions had model parameters that differed significantly between the three regions (p <= 0.05). While both had excellent goodness of fit, the Nakagami had higher Kullback-Leibler divergence. Recipient age correlated weakly with scale in the cortex (Omega: rho = 0.11, p = 0.004), while body mass index correlated weakly with shape in the medulla (m: rho = 0.08, p = 0.04). Neither sex, primary disease, nor donor type demonstrated any correlation. Based on our findings, we propose that the Nakagami distribution be used to characterize transplanted kidneys regionally, independent of disease etiology and most patient characteristics.
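To make the fitting step concrete, the following is a minimal sketch of estimating the Nakagami shape (m) and scale (Omega) for the amplitudes of one region. It assumes a method-of-moments estimator, which the abstract does not specify, and it runs on synthetic amplitudes rather than real B-mode data. The function names `fit_nakagami_moments`, `nakagami_pdf`, and `sample_nakagami` are hypothetical and introduced only for illustration.

```python
import math
import random


def fit_nakagami_moments(amplitudes):
    """Return method-of-moments estimates (m, omega) for positive amplitudes."""
    squared = [a * a for a in amplitudes]
    n = len(squared)
    omega = sum(squared) / n  # scale: Omega = E[X^2]
    var_sq = sum((s - omega) ** 2 for s in squared) / n  # Var(X^2)
    m = omega * omega / var_sq  # shape: m = E[X^2]^2 / Var(X^2)
    return m, omega


def nakagami_pdf(x, m, omega):
    """Nakagami density at x > 0; m = 1 reduces to Rayleigh with sigma^2 = omega / 2."""
    log_pdf = (
        math.log(2.0) + m * math.log(m) - math.lgamma(m) - m * math.log(omega)
        + (2.0 * m - 1.0) * math.log(x) - m * x * x / omega
    )
    return math.exp(log_pdf)


def sample_nakagami(m, omega, size, rng):
    """Draw synthetic amplitudes: the square root of a Gamma(m, omega / m) variate."""
    return [math.sqrt(rng.gammavariate(m, omega / m)) for _ in range(size)]


# Synthetic stand-in for the pixel amplitudes of one segmented region (not real data).
rng = random.Random(0)
region = sample_nakagami(m=0.8, omega=2.0, size=20000, rng=rng)
m_hat, omega_hat = fit_nakagami_moments(region)
print(f"m = {m_hat:.3f}, Omega = {omega_hat:.3f}")
```

Running the same estimator separately on the cortex, medulla, and central echogenic complex would give one (m, Omega) pair per region, which is the kind of regional comparison the abstract describes. A maximum-likelihood fit could be substituted for the moment estimator without changing the overall structure.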
Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even infeasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model in delineating the surgically targetable tumor, and a 23% improvement in delineating the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
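The idea of sharing only numerical model updates can be sketched as a single aggregation round. The example below assumes sample-weighted federated averaging, which is one common aggregation rule; the abstract does not state which rule the study used. The function name `federated_average` and the toy weight vectors are hypothetical.

```python
def federated_average(site_updates):
    """Combine per-site model weights into one consensus vector.

    site_updates is a list of (num_samples, weights) pairs, where weights is a
    list of floats of equal length at every site. Only these numbers leave a
    site; the underlying scans stay local.
    """
    total = sum(n for n, _ in site_updates)
    dim = len(site_updates[0][1])
    return [sum(n * w[i] for n, w in site_updates) / total for i in range(dim)]


# Two hypothetical sites with different amounts of local data.
updates = [
    (1, [0.0, 2.0]),  # small site
    (3, [4.0, 2.0]),  # larger site, so its weights count three times as much
]
consensus = federated_average(updates)
print(consensus)
```

Weighting by sample count lets larger sites influence the consensus model more, while a plain unweighted mean would treat every institution equally regardless of how much data it holds.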