Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Andrew Ng

Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis

Nov 27, 2024

Eva Prakash, Jeya Maria Jose Valanarasu, Zhihong Chen, Eduardo Pontes Reis, Andrew Johnston, Anuj Pareek, Christian Bluethgen, Sergios Gatidis, Cameron Olsen, Akshay Chaudhari(+2 more)

Abstract:Purpose: To explore best-practice approaches for generating synthetic chest X-ray images and augmenting medical imaging datasets to optimize the performance of deep learning models in downstream tasks like classification and segmentation. Materials and Methods: We utilized a latent diffusion model to condition the generation of synthetic chest X-rays on text prompts and/or segmentation masks. We explored methods like using a proxy model and using radiologist feedback to improve the quality of synthetic data. These synthetic images were then generated from relevant disease information or geometrically transformed segmentation masks and added to ground truth training set images from the CheXpert, CANDID-PTX, SIIM, and RSNA Pneumonia datasets to measure improvements in classification and segmentation model performance on the test sets. F1 and Dice scores were used to evaluate classification and segmentation respectively. One-tailed t-tests with Bonferroni correction assessed the statistical significance of performance improvements with synthetic data. Results: Across all experiments, the synthetic data we generated resulted in a maximum mean classification F1 score improvement of 0.150453 (CI: 0.099108-0.201798; P=0.0031) compared to using only real data. For segmentation, the maximum Dice score improvement was 0.14575 (CI: 0.108267-0.183233; P=0.0064). Conclusion: Best practices for generating synthetic chest X-ray images for downstream tasks include conditioning on single-disease labels or geometrically transformed segmentation masks, as well as potentially using proxy modeling for fine-tuning such generations.

Via

Access Paper or Ask Questions

Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation

Apr 25, 2024

Tanvi Deshpande, Eva Prakash, Elsie Gyang Ross, Curtis Langlotz, Andrew Ng, Jeya Maria Jose Valanarasu

Figure 1 for Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation

Figure 2 for Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation

Figure 3 for Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation

Figure 4 for Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation

Abstract:The high cost of creating pixel-by-pixel gold-standard labels, limited expert availability, and presence of diverse tasks make it challenging to generate segmentation labels to train deep learning models for medical imaging tasks. In this work, we present a new approach to overcome the hurdle of costly medical image labeling by leveraging foundation models like Segment Anything Model (SAM) and its medical alternate MedSAM. Our pipeline has the ability to generate weak labels for any unlabeled medical image and subsequently use it to augment label-scarce datasets. We perform this by leveraging a model trained on a few gold-standard labels and using it to intelligently prompt MedSAM for weak label generation. This automation eliminates the manual prompting step in MedSAM, creating a streamlined process for generating labels for both real and synthetic images, regardless of quantity. We conduct experiments on label-scarce settings for multiple tasks pertaining to modalities ranging from ultrasound, dermatology, and X-rays to demonstrate the usefulness of our pipeline. The code is available at https://github.com/stanfordmlgroup/Auto-Generate-WLs/.

* Accepted at MIDL 2024

Via

Access Paper or Ask Questions

Unlocking Robust Segmentation Across All Age Groups via Continual Learning

Apr 19, 2024

Chih-Ying Liu, Jeya Maria Jose Valanarasu, Camila Gonzalez, Curtis Langlotz, Andrew Ng, Sergios Gatidis

Figure 1 for Unlocking Robust Segmentation Across All Age Groups via Continual Learning

Figure 2 for Unlocking Robust Segmentation Across All Age Groups via Continual Learning

Figure 3 for Unlocking Robust Segmentation Across All Age Groups via Continual Learning

Abstract:Most deep learning models in medical imaging are trained on adult data with unclear performance on pediatric images. In this work, we aim to address this challenge in the context of automated anatomy segmentation in whole-body Computed Tomography (CT). We evaluate the performance of CT organ segmentation algorithms trained on adult data when applied to pediatric CT volumes and identify substantial age-dependent underperformance. We subsequently propose and evaluate strategies, including data augmentation and continual learning approaches, to achieve good segmentation accuracy across all age groups. Our best-performing model, trained using continual learning, achieves high segmentation accuracy on both adult and pediatric data (Dice scores of 0.90 and 0.84 respectively).

Via

Access Paper or Ask Questions

LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Nov 20, 2023

Vivek Shankar, Xiaoli Yang, Vrishab Krishna, Brent Tan, Oscar Silva, Rebecca Rojansky, Andrew Ng, Fabiola Valvert, Edward Briercheck, David Weinstock(+3 more)

Figure 1 for LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Figure 2 for LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Figure 3 for LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Figure 4 for LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Abstract:The accurate classification of lymphoma subtypes using hematoxylin and eosin (H&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML - an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H&E-stained tissue microarray cores, segment nuclei and cells, compute features encompassing morphology, texture, and architecture, and train gradient-boosted models to make diagnostic predictions. LymphoML's interpretable models, developed on a limited volume of H&E-stained tissue, achieve non-inferior diagnostic accuracy to pathologists using whole-slide images and outperform black box deep-learning on a dataset of 670 cases from Guatemala spanning 8 lymphoma subtypes. Using SHapley Additive exPlanation (SHAP) analysis, we assess the impact of each feature on model prediction and find that nuclear shape features are most discriminative for DLBCL (F1-score: 78.7%) and classical Hodgkin lymphoma (F1-score: 74.5%). Finally, we provide the first demonstration that a model combining features from H&E-stained tissue with features from a standardized panel of 6 immunostains results in a similar diagnostic accuracy (85.3%) to a 46-stain panel (86.1%).

* To be published in Proceedings of the 3rd Machine Learning for Health symposium, Proceedings of Machine Learning Research (PMLR)

Via

Access Paper or Ask Questions

How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

May 13, 2023

Cara Van Uden, Jeremy Irvin, Mars Huang, Nathan Dean, Jason Carr, Andrew Ng, Curtis Langlotz

Figure 1 for How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

Figure 2 for How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

Figure 3 for How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

Figure 4 for How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

Abstract:Self-supervised learning (SSL) enables label efficient training for machine learning models. This is essential for domains such as medical imaging, where labels are costly and time-consuming to curate. However, the most effective supervised or SSL strategy for transferring models to different healthcare systems or novel tasks is not well understood. In this work, we systematically experiment with a variety of supervised and self-supervised pretraining strategies using multimodal datasets of medical images (chest X-rays) and text (radiology reports). We then evaluate their performance on data from two external institutions with diverse sets of tasks. In addition, we experiment with different transfer learning strategies to effectively adapt these pretrained models to new tasks and healthcare systems. Our empirical results suggest that multimodal SSL gives substantial gains over unimodal SSL in performance across new healthcare systems and tasks, comparable to models pretrained with full supervision. We demonstrate additional performance gains with models further adapted to the new dataset and task, using multimodal domain-adaptive pretraining (DAPT), linear probing then finetuning (LP-FT), and both methods combined. We offer suggestions for alternative models to use in scenarios where not all of these additions are feasible. Our results provide guidance for improving the generalization of medical image interpretation models to new healthcare systems and novel tasks.

* 13 pages, 12 figures

Via

Access Paper or Ask Questions

DataPerf: Benchmarks for Data-Centric AI Development

Jul 20, 2022

Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Douwe Kiela, David Jurado(+26 more)

Figure 1 for DataPerf: Benchmarks for Data-Centric AI Development

Figure 2 for DataPerf: Benchmarks for Data-Centric AI Development

Figure 3 for DataPerf: Benchmarks for Data-Centric AI Development

Figure 4 for DataPerf: Benchmarks for Data-Centric AI Development

Abstract:Machine learning (ML) research has generally focused on models, while the most prominent datasets have been employed for everyday ML tasks without regard for the breadth, difficulty, and faithfulness of these datasets to the underlying problem. Neglecting the fundamental importance of datasets has caused major problems involving data cascades in real-world applications and saturation of dataset-driven criteria for model quality, hindering research growth. To solve this problem, we present DataPerf, a benchmark package for evaluating ML datasets and dataset-working algorithms. We intend it to enable the "data ratchet," in which training sets will aid in evaluating test sets on the same problems, and vice versa. Such a feedback-driven strategy will generate a virtuous loop that will accelerate development of data-centric AI. The MLCommons Association will maintain DataPerf.

Via

Access Paper or Ask Questions

Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

May 07, 2020

Hao Sheng, Xiao Chen, Jingyi Su, Ram Rajagopal, Andrew Ng

Figure 1 for Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

Figure 2 for Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

Figure 3 for Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

Figure 4 for Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

Abstract:How can we effectively leverage the domain knowledge from remote sensing to better segment agriculture land cover from satellite images? In this paper, we propose a novel, model-agnostic, data-fusion approach for vegetation-related computer vision tasks. Motivated by the various Vegetation Indices (VIs), which are introduced by domain experts, we systematically reviewed the VIs that are widely used in remote sensing and their feasibility to be incorporated in deep neural networks. To fully leverage the Near-Infrared channel, the traditional Red-Green-Blue channels, and Vegetation Index or its variants, we propose a Generalized Vegetation Index (GVI), a lightweight module that can be easily plugged into many neural network architectures to serve as an additional information input. To smoothly train models with our GVI, we developed an Additive Group Normalization (AGN) module that does not require extra parameters of the prescribed neural networks. Our approach has improved the IoUs of vegetation-related classes by 0.9-1.3 percent and consistently improves the overall mIoU by 2 percent on our baseline.

* CVPR 2020 - Vision for Agriculture; https://www.agriculture-vision.com

Via

Access Paper or Ask Questions

The 1st Agriculture-Vision Challenge: Methods and Results

Apr 23, 2020

Mang Tik Chiu, Xingqian Xu, Kai Wang, Jennifer Hobbs, Naira Hovakimyan, Thomas S. Huang, Honghui Shi, Yunchao Wei, Zilong Huang, Alexander Schwing(+32 more)

Figure 1 for The 1st Agriculture-Vision Challenge: Methods and Results

Figure 2 for The 1st Agriculture-Vision Challenge: Methods and Results

Figure 3 for The 1st Agriculture-Vision Challenge: Methods and Results

Figure 4 for The 1st Agriculture-Vision Challenge: Methods and Results

Abstract:The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset. Around 57 participating teams from various countries compete to achieve state-of-the-art in aerial agriculture semantic segmentation. The Agriculture-Vision Challenge Dataset was employed, which comprises of 21,061 aerial and multi-spectral farmland images. This paper provides a summary of notable methods and results in the challenge. Our submission server and leaderboard will continue to open for researchers that are interested in this challenge dataset and task; the link can be found here.

* CVPR 2020 Workshop

Via

Access Paper or Ask Questions

Predicting Inpatient Discharge Prioritization With Electronic Health Records

Dec 02, 2018

Anand Avati, Stephen Pfohl, Chris Lin, Thao Nguyen, Meng Zhang, Philip Hwang, Jessica Wetstone, Kenneth Jung, Andrew Ng, Nigam H. Shah

Figure 1 for Predicting Inpatient Discharge Prioritization With Electronic Health Records

Figure 2 for Predicting Inpatient Discharge Prioritization With Electronic Health Records

Figure 3 for Predicting Inpatient Discharge Prioritization With Electronic Health Records

Figure 4 for Predicting Inpatient Discharge Prioritization With Electronic Health Records

Abstract:Identifying patients who will be discharged within 24 hours can improve hospital resource management and quality of care. We studied this problem using eight years of Electronic Health Records (EHR) data from Stanford Hospital. We fit models to predict 24 hour discharge across the entire inpatient population. The best performing models achieved an area under the receiver-operator characteristic curve (AUROC) of 0.85 and an AUPRC of 0.53 on a held out test set. This model was also well calibrated. Finally, we analyzed the utility of this model in a decision theoretic framework to identify regions of ROC space in which using the model increases expected utility compared to the trivial always negative or always positive classifiers.

Via

Access Paper or Ask Questions

Countdown Regression: Sharp and Calibrated Survival Predictions

Jun 21, 2018

Anand Avati, Tony Duan, Kenneth Jung, Nigam H. Shah, Andrew Ng

Figure 1 for Countdown Regression: Sharp and Calibrated Survival Predictions

Figure 2 for Countdown Regression: Sharp and Calibrated Survival Predictions

Figure 3 for Countdown Regression: Sharp and Calibrated Survival Predictions

Figure 4 for Countdown Regression: Sharp and Calibrated Survival Predictions

Abstract:Personalized probabilistic forecasts of time to event (such as mortality) can be crucial in decision making, especially in the clinical setting. Inspired by ideas from the meteorology literature, we approach this problem through the paradigm of maximizing sharpness of prediction distributions, subject to calibration. In regression problems, it has been shown that optimizing the continuous ranked probability score (CRPS) instead of maximum likelihood leads to sharper prediction distributions while maintaining calibration. We introduce the Survival-CRPS, a generalization of the CRPS to the time to event setting, and present right-censored and interval-censored variants. To holistically evaluate the quality of predicted distributions over time to event, we present the Survival-AUPRC evaluation metric, an analog to area under the precision-recall curve. We apply these ideas by building a recurrent neural network for mortality prediction, using an Electronic Health Record dataset covering millions of patients. We demonstrate significant benefits in models trained by the Survival-CRPS objective instead of maximum likelihood.

Via

Access Paper or Ask Questions