Abstract:In clinical practice, we often see significant delays between MRI scans and the diagnosis made by radiologists, even for severe cases. In some cases, this may be caused by the lack of additional information and clues, so even the severe cases need to wait in the queue for diagnosis. This can be avoided if there is an automatic software tool, which would supplement additional information, alerting radiologists that the particular patient may be a severe case. We are presenting an automatic brain MRI Screening Tool and we are demonstrating its capabilities for detecting tumor-like pathologies. It is the first version on the path toward a robust multi-pathology screening solution. The tool supports Federated Learning, so multiple institutions may contribute to the model without disclosing their private data.
Abstract:Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.