Abstract:Electronic health records (EHRs), digital collections of patient healthcare events and observations, are ubiquitous in medicine and critical to healthcare delivery, operations, and research. Despite this central role, EHRs are notoriously difficult to process automatically. Well over half of the information stored within EHRs is in the form of unstructured text (e.g., provider notes, operation reports) and remains largely untapped for secondary use. Recently, however, neural network and deep learning approaches to Natural Language Processing (NLP) have made considerable advances, outperforming traditional statistical and rule-based systems on a variety of tasks. In this survey paper, we summarize current neural NLP methods for EHR applications. We cover a broad range of tasks: classification and prediction, word embeddings, extraction, and generation, as well as other topics such as question answering, phenotyping, knowledge graphs, medical dialogue, multilinguality, and interpretability.
Abstract:Blood pressure monitoring is an essential component of hypertension management and of the prediction of associated comorbidities. Blood pressure is a dynamic vital sign that changes frequently throughout the day. Capturing blood pressure remotely and frequently (also known as ambulatory blood pressure monitoring) has traditionally been achieved by measuring blood pressure at discrete intervals using an inflatable cuff. However, there is growing interest in developing a cuffless ambulatory blood pressure monitoring system that measures blood pressure continuously. One such approach uses bioimpedance sensors to build regression models. A practical problem with this approach is that the amount of data required to confidently train such a regression model can be prohibitive. In this paper, we propose applying the domain-adversarial training neural network (DANN) method to our multitask learning (MTL) blood pressure estimation model, allowing for knowledge transfer between subjects. Our proposed model obtains an average root mean square error (RMSE) of $4.80 \pm 0.74$ mmHg for diastolic blood pressure and $7.34 \pm 1.88$ mmHg for systolic blood pressure when using three minutes of training data, $4.64 \pm 0.60$ mmHg and $7.10 \pm 1.79$ mmHg respectively when using four minutes of training data, and $4.48 \pm 0.57$ mmHg and $6.79 \pm 1.70$ mmHg respectively when using five minutes of training data. DANN improves training with minimal data in comparison both to direct training and to training with a pretrained model from another subject, decreasing RMSE by $0.19$ to $0.26$ mmHg (diastolic) and by $0.46$ to $0.67$ mmHg (systolic) relative to the best baseline models. We observe that four minutes of training data is the minimum requirement for our framework to exceed ISO standards within this cohort of patients.
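As a rough illustration of how figures such as $4.80 \pm 0.74$ mmHg might be aggregated, the sketch below computes an RMSE per subject and then reports the mean and spread across subjects. The abstract does not say how the $\pm$ term is computed, so the across-subject sample standard deviation, the function names `rmse` and `summarize_rmse`, and the toy readings are all assumptions made for illustration.

```python
import math
import statistics


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences (mmHg)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def summarize_rmse(per_subject_pairs):
    """Return (mean, spread) of per-subject RMSE values.

    Assumption: the +/- term is the sample standard deviation across
    subjects; with a single subject the spread is reported as 0.0.
    """
    scores = [rmse(pred, ref) for pred, ref in per_subject_pairs]
    spread = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return statistics.mean(scores), spread


# Toy, made-up diastolic readings for two hypothetical subjects:
# each entry is (model estimates, cuff reference values).
toy = [
    ([80.0, 82.0, 79.0], [78.0, 85.0, 79.0]),
    ([70.0, 74.0], [71.0, 70.0]),
]
mean_rmse, std_rmse = summarize_rmse(toy)
print(f"{mean_rmse:.2f} +/- {std_rmse:.2f} mmHg")
```

Computing the error per subject before averaging keeps a subject with many readings from dominating the summary, which seems consistent with a subject-to-subject transfer setting, though the original evaluation protocol may differ.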
Abstract:Cardiovascular disorders account for nearly 1 in 3 deaths in the United States. Care for these disorders is often determined during visits to acute care facilities, such as hospitals. While the time spent in these settings represents just a small proportion of patients' lives, it accounts for a disproportionately large amount of decision making. To overcome this bias towards data from acute care settings, there is a need for longitudinal monitoring in patients with cardiovascular disorders. Longitudinal monitoring can provide a more comprehensive picture of patient health, allowing for more informed decision making. This work surveys current sensing technologies and machine learning analytics for remote monitoring of cardiovascular disorders. We highlight three primary needs in the design of new smart health technologies: 1) the need for sensing technology that can track longitudinal trends in signs and symptoms of the cardiovascular disorder despite potentially infrequent, noisy, or missing data measurements; 2) the need for new analytic techniques that model data captured in a longitudinal, continual fashion to aid in the development of new risk prediction techniques and in tracking disease progression; and 3) the need for machine learning techniques that are personalized and interpretable, allowing for advancements in shared clinical decision making. We ground these needs in the current state of the art in smart health technologies and analytics, and discuss the ample opportunities for addressing all three in technologies and analytics applied to cardiovascular disorders and care.
Abstract:We address the problem of defining a network graph on a large collection of classes. Each class consists of a collection of data points, sampled in a non-i.i.d. way from some unknown underlying distribution. The application we consider in this paper is a large-scale, high-dimensional survey of people living in the US, and we ask how similar or different the counties in which these people live are. We use a co-clustering diffusion metric to learn the underlying distribution of people, and build an approximate earth mover's distance algorithm using this data-adaptive transportation cost.
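To make the idea of a distance-based graph over classes concrete, here is a minimal sketch with deliberately simplified ingredients. Each class is a list of one-dimensional samples, the transport cost is the plain absolute difference (not the co-clustering diffusion metric described above), and the earth mover's distance is computed exactly from the two empirical quantile functions rather than approximated. The names `emd_1d` and `distance_graph`, the threshold rule for edges, and the toy county data are all hypothetical.

```python
def emd_1d(xs, ys):
    """Exact 1-D earth mover's distance between two empirical samples.

    Integrates |F^-1(u) - G^-1(u)| over u in [0, 1]. Quantile levels are
    tracked as integers in units of 1/(n*m) to avoid float comparisons.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    i = j = 0
    pos = 0        # current quantile level, in units of 1/(n*m)
    total = 0.0
    while i < n and j < m:
        end_x = (i + 1) * m   # level at which xs[i] stops being the quantile
        end_y = (j + 1) * n   # level at which ys[j] stops being the quantile
        nxt = min(end_x, end_y)
        total += (nxt - pos) * abs(xs[i] - ys[j])
        pos = nxt
        if end_x == nxt:
            i += 1
        if end_y == nxt:
            j += 1
    return total / (n * m)


def distance_graph(classes, threshold):
    """Connect two classes when their distance is at most `threshold`.

    Returns a dict mapping (name_a, name_b) to the distance, name_a < name_b.
    """
    names = sorted(classes)
    edges = {}
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            d = emd_1d(classes[a], classes[b])
            if d <= threshold:
                edges[(a, b)] = d
    return edges


# Toy, made-up survey responses for three hypothetical counties.
counties = {
    "county_a": [0.0],
    "county_b": [1.0, 3.0],
    "county_c": [10.0, 10.0],
}
print(distance_graph(counties, threshold=2.5))
```

Working with whole samples rather than class means lets two classes with the same average but different spreads still register as distinct, which is the main reason to prefer a transport distance for this kind of comparison.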
Abstract:In this paper, we build an organization of high-dimensional datasets that cannot be cleanly embedded into a low-dimensional representation, both because of missing entries and because a subset of the features is irrelevant to modeling the functions of interest. Our algorithm begins by defining coarse neighborhoods of the points and an expected empirical function value on each of these neighborhoods. We then generate new non-linear features with deep net representations tuned to model the approximate function, and re-organize the geometry of the points with respect to the new representation. Finally, the points are locally z-scored to create an intrinsic geometric organization that is independent of the parameters of the deep net, a geometry designed to ensure smoothness with respect to the empirical function. We examine this approach on data from the Centers for Medicare & Medicaid Services Hospital Quality Initiative, and generate an intrinsic low-dimensional organization of the hospitals that is smooth with respect to an expert-driven function of quality.
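One plausible reading of the local z-scoring step is sketched below: each point is standardized using the per-feature mean and standard deviation of its k nearest neighbors. In the pipeline described above, the input would presumably be the learned deep net representation of each point. The Euclidean k-NN neighborhood, the choice of k, the population standard deviation, and the zero-variance fallback are all assumptions, and `local_zscore` is a hypothetical name.

```python
import math


def local_zscore(points, k):
    """Z-score each point against its k nearest neighbours (itself included).

    `points` is a list of equal-length feature lists. Features with zero
    variance inside a neighbourhood are mapped to 0.0 (an assumption).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    result = []
    for p in points:
        # Euclidean k-NN; the point itself is always at distance 0.
        neighbours = sorted(points, key=lambda q: math.dist(p, q))[:k]
        scored = []
        for d in range(len(p)):
            column = [q[d] for q in neighbours]
            mean = sum(column) / k
            var = sum((v - mean) ** 2 for v in column) / k
            std = math.sqrt(var)
            scored.append((p[d] - mean) / std if std > 0 else 0.0)
        result.append(scored)
    return result


# Toy, made-up 2-D representations of four hypothetical hospitals.
embedded = [[0.0, 1.0], [2.0, 1.5], [1.0, 0.5], [10.0, 9.0]]
for row in local_zscore(embedded, k=3):
    print([round(v, 3) for v in row])
```

Standardizing within a neighborhood removes local differences in scale, which is one way a geometry could become less dependent on how a particular network happened to stretch the feature space.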