Michael Gao

Development and Validation of ML-DQA -- a Machine Learning Data Quality Assurance Framework for Healthcare

Aug 04, 2022

Mark Sendak, Gaurav Sirdeshmukh, Timothy Ochoa, Hayley Premo, Linda Tang, Kira Niederhoffer, Sarah Reed, Kaivalya Deshpande, Emily Sterrett, Melissa Bauer(+13 more)

Abstract: The approaches by which the machine learning and clinical research communities utilize real world data (RWD), including data captured in the electronic health record (EHR), vary dramatically. While clinical researchers cautiously use RWD for clinical investigations, ML for healthcare teams consume public datasets with minimal scrutiny to develop new algorithms. This study bridges this gap by developing and validating ML-DQA, a data quality assurance framework grounded in RWD best practices. The ML-DQA framework is applied to five ML projects across two geographies, different medical conditions, and different cohorts. A total of 2,999 quality checks and 24 quality reports were generated on RWD gathered on 247,536 patients across the five projects. Five generalizable practices emerge: all projects used a similar method to group redundant data element representations; all projects used automated utilities to build diagnosis and medication data elements; all projects used a common library of rules-based transformations; all projects used a unified approach to assign data quality checks to data elements; and all projects used a similar approach to clinical adjudication. An average of 5.8 individuals, including clinicians, data scientists, and trainees, were involved in implementing ML-DQA for each project and an average of 23.4 data elements per project were either transformed or removed in response to ML-DQA. This study demonstrates the important role of ML-DQA in healthcare projects and provides teams a framework to conduct these essential activities.

* Presented at 2022 Machine Learning in Health Care Conference

Neural network based order parameter for phase transitions and its applications in high-entropy alloys

Sep 12, 2021

Junqi Yin, Zongrui Pei, Michael Gao

Abstract: Phase transition is one of the most important phenomena in nature and plays a central role in materials design. All phase transitions, including the order-disorder phase transition, are characterized by suitable order parameters. However, finding a representative order parameter for complex systems, such as high-entropy alloys, is nontrivial. Given the strength of variational autoencoders (VAEs) in reducing high-dimensional data to a few principal components, here we coin a new concept of "VAE order parameter". We propose that the Manhattan distance in the VAE latent space can serve as a generic order parameter for order-disorder phase transitions. The physical properties of the order parameter are quantitatively interpreted and demonstrated by multiple refractory high-entropy alloys. Assisted by it, a generally applicable alloy design concept is proposed by mimicking the natural mixing of elements. Our physically interpretable "VAE order parameter" lays the foundation for understanding chemical ordering and for alloy design based on it.
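
The Manhattan (L1) distance mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which reference point the distance is measured from, so measuring from the latent-space origin is an assumption here, as are the function name `manhattan_order_parameter` and the toy latent codes; a trained VAE encoder would supply the real latent vectors.

```python
import numpy as np

def manhattan_order_parameter(latent, reference=None):
    """Return the L1 (Manhattan) distance of each latent vector from a reference point.

    latent:    array of shape (n_samples, latent_dim), e.g. VAE encoder means
    reference: array of shape (latent_dim,); defaults to the origin, which is
               an assumption of this sketch and is not stated in the abstract
    """
    latent = np.asarray(latent, dtype=float)
    if reference is None:
        reference = np.zeros(latent.shape[1])
    # Sum of absolute coordinate differences along the latent dimension
    return np.abs(latent - reference).sum(axis=1)

# Hypothetical 2-D latent codes standing in for encoder outputs
z = np.array([[0.0, 0.0], [1.0, -2.0], [-0.5, 0.5]])
order_parameter = manhattan_order_parameter(z)
```

Under this assumption, a configuration encoded near the reference point gets a value near zero, and the value grows as the encoding moves away from it.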

Variational Disentanglement for Rare Event Modeling

Sep 21, 2020

Zidi Xiu, Chenyang Tao, Michael Gao, Connor Davis, Benjamin Goldstein, Ricardo Henao

Abstract: The increasing availability and abundance of healthcare data, combined with current advances in machine learning methods, have created renewed opportunities to improve clinical decision support systems. However, in healthcare risk prediction applications, the proportion of cases with the condition (label) of interest is often very low relative to the available sample size. Though very prevalent in healthcare, such imbalanced classification settings are also common and challenging in many other scenarios. So motivated, we propose a variational disentanglement approach to semi-parametrically learn from rare events in heavily imbalanced classification problems. Specifically, we leverage the imposed extreme-distribution behavior on a latent space to extract information from low-prevalence events, and develop a robust prediction arm that joins the merits of the generalized additive model and isotonic neural nets. Results on synthetic studies and diverse real-world datasets, including mortality prediction on a COVID-19 cohort, demonstrate that the proposed approach outperforms existing alternatives.

"The Human Body is a Black Box": Supporting Clinical Decision-Making with Deep Learning

Dec 07, 2019

Mark Sendak, Madeleine Elish, Michael Gao, Joseph Futoma, William Ratliff, Marshall Nichols, Armando Bedoya, Suresh Balu, Cara O'Brien

Abstract: Machine learning technologies are increasingly developed for use in healthcare. While research communities have focused on creating state-of-the-art models, there has been less focus on real-world implementation and the associated challenges to accuracy, fairness, accountability, and transparency that come from actual, situated use. Serious questions remain underexamined regarding how to ethically build models, interpret and explain model output, recognize and account for biases, and minimize disruptions to professional expertise and work cultures. We address this gap in the literature and provide a detailed case study covering the development, implementation, and evaluation of Sepsis Watch, a machine learning-driven tool that assists hospital clinicians in the early diagnosis and treatment of sepsis. We, the team that developed and evaluated the tool, discuss our conceptualization of the tool not as a model deployed in the world but instead as a socio-technical system requiring integration into existing social and professional contexts. Rather than focusing on model interpretability to ensure fair and accountable machine learning, we point toward four key values and practices that should be considered when developing machine learning to support clinical decision-making: rigorously define the problem in context, build relationships with stakeholders, respect professional discretion, and create ongoing feedback loops with stakeholders. Our work has significant implications for future research regarding mechanisms of institutional accountability and considerations for designing machine learning systems. Our work underscores the limits of model interpretability as a solution to ensure transparency, accuracy, and accountability in practice. Instead, our work demonstrates other means and goals to achieve FATML values in design and in practice.

* To appear at ACM FAT* 2020, Barcelona. Updated to camera-ready version
