on behalf of CURE-CKD
Abstract:Randomized controlled trials (RCTs) are the standard for evaluating the effectiveness of clinical interventions. To address the limitations of RCTs on real-world populations, we developed a methodology that uses a large observational electronic health record (EHR) dataset. Principles of regression discontinuity (rd) were used to derive randomized data subsets to test expert-driven interventions using dynamic Bayesian Networks (DBNs) do-operations. This combined method was applied to a chronic kidney disease (CKD) cohort of more than two million individuals and used to understand the associational and causal relationships of CKD variables with respect to a surrogate outcome of >=40% decline in estimated glomerular filtration rate (eGFR). The associational and causal analyses depicted similar findings across DBNs from two independent healthcare systems. The associational analysis showed that the most influential variables were eGFR, urine albumin-to-creatinine ratio, and pulse pressure, whereas the causal analysis showed eGFR as the most influential variable, followed by modifiable factors such as medications that may impact kidney function over time. This methodology demonstrates how real-world EHR data can be used to provide population-level insights to inform improved healthcare delivery.
Abstract:Several algorithms for learning the structure of dynamic Bayesian networks (DBNs) require an a priori ordering of variables, which influences the determined graph topology. However, it is often unclear how to determine this order if feature importance is unknown, especially as an exhaustive search is usually impractical. In this paper, we introduce Ranking Approaches for Unknown Structures (RAUS), an automated framework to systematically inform variable ordering and learn networks end-to-end. RAUS leverages existing statistical methods (Cramers V, chi-squared test, and information gain) to compare variable ordering, resultant generated network topologies, and DBN performance. RAUS enables end-users with limited DBN expertise to implement models via command line interface. We evaluate RAUS on the task of predicting impending acute kidney injury (AKI) from inpatient clinical laboratory data. Longitudinal observations from 67,460 patients were collected from our electronic health record (EHR) and Kidney Disease Improving Global Outcomes (KDIGO) criteria were then applied to define AKI events. RAUS learns multiple DBNs simultaneously to predict a future AKI event at different time points (i.e., 24-, 48-, 72-hours in advance of AKI). We also compared the results of the learned AKI prediction models and variable orderings to baseline techniques (logistic regression, random forests, and extreme gradient boosting). The DBNs generated by RAUS achieved 73-83% area under the receiver operating characteristic curve (AUCROC) within 24-hours before AKI; and 71-79% AUCROC within 48-hours before AKI of any stage in a 7-day observation window. Insights from this automated framework can help efficiently implement and interpret DBNs for clinical decision support. The source code for RAUS is available in GitHub at https://github.com/dgrdn08/RAUS .
Abstract:While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hierarchical semantic convolutional neural network (HSCNN) to predict whether a given pulmonary nodule observed on a computed tomography (CT) scan is malignant. Our network provides two levels of output: 1) low-level radiologist semantic features, and 2) a high-level malignancy prediction score. The low-level semantic outputs quantify the diagnostic features used by radiologists and serve to explain how the model interprets the images in an expert-driven manner. The information from these low-level tasks, along with the representations learned by the convolutional layers, are then combined and used to infer the high-level task of predicting nodule malignancy. This unified architecture is trained by optimizing a global loss function including both low- and high-level tasks, thereby learning all the parameters within a joint framework. Our experimental results using the Lung Image Database Consortium (LIDC) show that the proposed method not only produces interpretable lung cancer predictions but also achieves significantly better results compared to common 3D CNN approaches.