Abstract:Problem lists are intended to provide clinicians with a relevant summary of patient medical issues and are embedded in many electronic health record systems. Despite their importance, problem lists are often cluttered with resolved or currently irrelevant conditions. In this work, we develop a novel end-to-end framework that first extracts diagnosis and procedure information from clinical notes and subsequently uses the extracted medical problems to predict patient outcomes. This framework is both more performant and more interpretable than existing models used within the domain, achieving an AU-ROC of 0.710 for bounceback readmission and 0.869 for in-hospital mortality occurring after ICU discharge. We identify risk factors for both readmission and mortality outcomes and demonstrate that our framework can be used to develop dynamic problem lists that present clinical problems along with their quantitative importance. We conduct a qualitative user study with medical experts and demonstrate that they view the lists produced by our framework favorably and find them to be a more effective clinical decision support tool than a strong baseline.
Abstract:Blood pressure monitoring is an essential component of hypertension management and in the prediction of associated comorbidities. Blood pressure is a dynamic vital sign with frequent changes throughout a given day. Capturing blood pressure remotely and frequently (also known as ambulatory blood pressure monitoring) has traditionally been achieved by measuring blood pressure at discrete intervals using an inflatable cuff. However, there is growing interest in developing a cuffless ambulatory blood pressure monitoring system to measure blood pressure continuously. One such approach is by utilizing bioimpedance sensors to build regression models. A practical problem with this approach is that the amount of data required to confidently train such a regression model can be prohibitive. In this paper, we propose the application of the domain-adversarial training neural network (DANN) method on our multitask learning (MTL) blood pressure estimation model, allowing for knowledge transfer between subjects. Our proposed model obtains average root mean square error (RMSE) of $4.80 \pm 0.74$ mmHg for diastolic blood pressure and $7.34 \pm 1.88$ mmHg for systolic blood pressure when using three minutes of training data, $4.64 \pm 0.60$ mmHg and $7.10 \pm 1.79$ respectively when using four minutes of training data, and $4.48 \pm 0.57$ mmHg and $6.79 \pm 1.70$ respectively when using five minutes of training data. DANN improves training with minimal data in comparison to both directly training and to training with a pretrained model from another subject, decreasing RMSE by $0.19$ to $0.26$ mmHg (diastolic) and by $0.46$ to $0.67$ mmHg (systolic) in comparison to the best baseline models. We observe that four minutes of training data is the minimum requirement for our framework to exceed ISO standards within this cohort of patients.
Abstract:Clinical notes contain a large amount of clinically valuable information that is ignored in many clinical decision support systems due to the difficulty that comes with mining that information. Recent work has found success leveraging deep learning models for the prediction of clinical outcomes using clinical notes. However, these models fail to provide clinically relevant and interpretable information that clinicians can utilize for informed clinical care. In this work, we augment a popular convolutional model with an attention mechanism and apply it to unstructured clinical notes for the prediction of ICU readmission and mortality. We find that the addition of the attention mechanism leads to competitive performance while allowing for the straightforward interpretation of predictions. We develop clear visualizations to present important spans of text for both individual predictions and high-risk cohorts. We then conduct a qualitative analysis and demonstrate that our model is consistently attending to clinically meaningful portions of the narrative for all of the outcomes that we explore.
Abstract:Cardiovascular disorders account for nearly 1 in 3 deaths in the United States. Care for these disorders are often determined during visits to acute care facilities, such as hospitals. While the length of stay in these settings represents just a small proportion of patients' lives, they account for a disproportionately large amount of decision making. To overcome this bias towards data from acute care settings, there is a need for longitudinal monitoring in patients with cardiovascular disorders. Longitudinal monitoring can provide a more comprehensive picture of patient health, allowing for more informed decision making. This work surveys the current field of sensing technologies and machine learning analytics that exist in the field of remote monitoring for cardiovascular disorders. We highlight three primary needs in the design of new smart health technologies: 1) the need for sensing technology that can track longitudinal trends in signs and symptoms of the cardiovascular disorder despite potentially infrequent, noisy, or missing data measurements; 2) the need for new analytic techniques that model data captured in a longitudinal, continual fashion to aid in the development of new risk prediction techniques and in tracking disease progression; and 3) the need for machine learning techniques that are personalized and interpretable, allowing for advancements in shared clinical decision making. We highlight these needs based upon the current state-of-the-art in smart health technologies and analytics and discuss the ample opportunities that exist in addressing all three needs in the development of smart health technologies and analytics applied to the field of cardiovascular disorders and care.
Abstract:Visual summarization of clinical data collected on patients contained within the electronic health record (EHR) may enable precise and rapid triage at the time of patient presentation to an emergency department (ED). The triage process is critical in the appropriate allocation of resources and in anticipating eventual patient disposition, typically admission to the hospital or discharge home. EHR data are high-dimensional and complex, but offer the opportunity to discover and characterize underlying data-driven patient phenotypes. These phenotypes will enable improved, personalized therapeutic decision making and prognostication. In this work, we focus on the challenge of two-dimensional patient projections. A low dimensional embedding offers visual interpretability lost in higher dimensions. While linear dimensionality reduction techniques such as principal component analysis are often used towards this aim, they are insufficient to describe the variance of patient data. In this work, we employ the newly-described non-linear embedding technique called uniform manifold approximation and projection (UMAP). UMAP seeks to capture both local and global structures in high-dimensional data. We then use Gaussian mixture models to identify clusters in the embedded data and use the adjusted Rand index (ARI) to establish stability in the discovery of these clusters. This technique is applied to five common clinical chief complaints from a real-world ED EHR dataset, describing the emergent properties of discovered clusters. We observe clinically-relevant cluster attributes, suggesting that visual embeddings of EHR data using non-linear dimensionality reduction is a promising approach to reveal data-driven patient phenotypes. In the five chief complaints, we find between 2 and 6 clusters, with the peak mean pairwise ARI between subsequent training iterations to range from 0.35 to 0.74.