Abstract:Vision-language pretraining has been shown to produce high-quality visual encoders that transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experiment with bidirectional captioning of radiology reports as a form of pretraining and compare the quality and utility of the learned embeddings with those from contrastive pretraining methods. We optimize a CNN-encoder, transformer-decoder architecture named RadTex for the radiology domain. Results show not only that captioning pretraining yields visual encoders competitive with contrastive pretraining (CheXpert competition multi-label AUC of 89.4%), but also that our transformer decoder generates clinically relevant reports (captioning macro-F1 score of 0.349 using the CheXpert labeler) and responds to prompts with targeted, interactive outputs.
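To make the captioning-pretraining setup concrete, the following is a minimal PyTorch-style sketch of a CNN encoder feeding a transformer decoder that generates report tokens. It is a simplified left-to-right variant; the class name, ResNet-50 backbone, and layer sizes are illustrative assumptions rather than the actual RadTex configuration.

    # Illustrative sketch only; not the RadTex architecture or its hyperparameters.
    import torch
    import torch.nn as nn
    import torchvision

    class CaptioningPretrainer(nn.Module):
        def __init__(self, vocab_size, d_model=512):
            super().__init__()
            cnn = torchvision.models.resnet50(weights=None)
            self.encoder = nn.Sequential(*list(cnn.children())[:-2])  # keep the spatial feature map
            self.proj = nn.Linear(2048, d_model)
            layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=6)
            self.embed = nn.Embedding(vocab_size, d_model)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, images, report_tokens):
            feats = self.encoder(images)                               # (B, 2048, H', W')
            memory = self.proj(feats.flatten(2).transpose(1, 2))       # visual tokens for cross-attention
            tgt = self.embed(report_tokens)
            T = report_tokens.size(1)
            causal = torch.triu(torch.full((T, T), float('-inf'), device=images.device), diagonal=1)
            out = self.decoder(tgt, memory, tgt_mask=causal)
            return self.lm_head(out)                                   # next-token logits over the report vocabulary

Pretraining would minimize next-token cross-entropy on the report tokens; only the visual encoder is carried over to downstream tasks.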
Abstract:Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples that are treated as dissimilar but belong to the same class. In healthcare data, the underlying class distribution is nonuniform, implying that false negatives occur at a highly variable rate. To improve the quality of learned representations, we develop a novel approach that corrects for false negatives. Our method can be viewed as a variant of debiased contrastive learning that uses estimated sample-specific class probabilities. We provide a theoretical analysis of the objective function and demonstrate the proposed approach on both image and paired image-text data sets. Our experiments demonstrate empirical advantages of sample-specific debiasing.
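For reference, the sketch below shows one way to implement a debiased InfoNCE-style objective in which the usual global class-prior constant is replaced by a per-sample estimate. The function name, tensor shapes, and clamping constant follow the standard debiased contrastive formulation and are assumptions rather than the paper's exact objective.

    # Hedged sketch of a sample-specific debiased contrastive loss; names are assumed.
    import math
    import torch

    def debiased_contrastive_loss(z_img, z_txt, tau_plus, temperature=0.1):
        """z_img, z_txt: (B, D) L2-normalized paired embeddings.
        tau_plus: (B,) estimated per-sample probability that a random negative shares the class."""
        sim = z_img @ z_txt.t() / temperature                      # (B, B) similarities
        pos = torch.exp(sim.diag())                                # matched image-report pairs
        off_diag = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
        neg = torch.exp(sim)[off_diag].view(sim.size(0), -1)       # unmatched pairs used as negatives
        n = neg.size(1)
        # Debiased negative term: subtract the expected false-negative contribution per sample.
        corrected = (neg.sum(dim=1) - n * tau_plus * pos) / (1.0 - tau_plus)
        corrected = torch.clamp(corrected, min=n * math.exp(-1.0 / temperature))
        return -torch.log(pos / (pos + corrected)).mean()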
Abstract:Image-text multimodal representation learning aligns data across modalities and enables important medical applications, e.g., image classification, visual grounding, and cross-modal retrieval. In this work, we establish a connection between multimodal representation learning and multiple instance learning. Based on this connection, we propose a generic framework for constructing permutation-invariant score functions with many existing multimodal representation learning approaches as special cases. Furthermore, we use the framework to derive a novel contrastive learning approach and demonstrate that our method achieves state-of-the-art results on a number of downstream tasks.
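To illustrate the multiple-instance-learning view, here is one possible permutation-invariant image-text score built from local (instance-level) similarities. The LogSumExp-then-mean pooling is only one member of such a framework and is an assumption, not the specific score function derived in the paper.

    # One example of a permutation-invariant score over local features; illustrative only.
    import torch

    def mil_score(image_patches, text_tokens, temperature=0.07):
        """image_patches: (P, D) and text_tokens: (T, D), both L2-normalized.
        The returned score is unchanged under any permutation of patches or tokens."""
        sim = image_patches @ text_tokens.t()                    # (P, T) local similarities
        # Smooth-max (LogSumExp) pooling over patches per token, then average over tokens.
        per_token = temperature * torch.logsumexp(sim / temperature, dim=0)
        return per_token.mean()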
Abstract:Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance -- often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data (fewer than 1000 examples). Specifically, we examine image-captioning pretraining to learn high-quality medical image representations from which classifiers can be trained with fewer labeled examples. Following joint pretraining of a convolutional encoder and transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, we find that our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited.
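A minimal sketch of the transfer step follows, assuming the captioning-pretrained encoder outputs a spatial feature map; the class name, feature dimension, and pooling choice are illustrative assumptions.

    # Reuse a captioning-pretrained CNN encoder for downstream classification; sketch only.
    import torch.nn as nn

    class TransferClassifier(nn.Module):
        def __init__(self, pretrained_encoder, num_classes, feat_dim=2048):
            super().__init__()
            self.encoder = pretrained_encoder           # frozen or fine-tuned, depending on budget
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.head = nn.Linear(feat_dim, num_classes)

        def forward(self, images):
            feats = self.pool(self.encoder(images)).flatten(1)   # (B, feat_dim)
            return self.head(feats)                              # per-pathology logits

On a small labeled set, the head would typically be trained with a multi-label objective such as binary cross-entropy on the per-pathology logits.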
Abstract:Adoption of machine learning models in healthcare requires end users' trust in the system. Models that provide additional supportive evidence for their predictions promise to facilitate adoption. We define consistent evidence to be both compatible and sufficient with respect to model predictions. We propose measures of model inconsistency and regularizers that promote more consistent evidence. We demonstrate our ideas in the context of edema severity grading from chest radiographs and show empirically that consistent models provide competitive performance while supporting interpretation.
Abstract:We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method learns image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual information estimation with neural network discriminators. We argue that the sum of local mutual information terms is typically a lower bound on the global mutual information. Our experimental results on downstream image classification tasks demonstrate the advantages of using local features for image-text representation learning.
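For illustration, a simplified InfoNCE-style estimator of the mutual information between local image features and a text representation is sketched below; the paper's neural-network discriminator and feature granularity may differ, and all names and shapes here are assumptions.

    # InfoNCE-style lower bound on local image-text mutual information; illustrative only.
    import torch
    import torch.nn.functional as F

    def local_infonce(img_local, txt_global, temperature=0.1):
        """img_local: (B, P, D) patch features; txt_global: (B, D) text features."""
        img_local = F.normalize(img_local, dim=-1)
        txt_global = F.normalize(txt_global, dim=-1)
        # Score every patch of every image against every text in the batch.
        logits = torch.einsum('bpd,cd->bpc', img_local, txt_global) / temperature   # (B, P, B)
        targets = torch.arange(logits.size(0), device=logits.device)
        targets = targets.unsqueeze(1).expand(-1, logits.size(1))                   # (B, P)
        # Each local patch should identify its own paired report among the batch.
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))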
Abstract:We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This scarcity of labels is a significant challenge for learning image classification models. To take advantage of the rich information present in the radiology reports, we develop a neural network model that is trained on both images and free text, and assesses pulmonary edema severity from chest radiographs alone at inference time. Our experimental results suggest that the joint image-text representation learning improves the performance of pulmonary edema assessment compared to a supervised model trained on images only. We also show how the joint model uses the text to explain its image classifications. To the best of our knowledge, our approach is the first to leverage free-text radiology reports for improving the image model performance in this application. Our code is available at https://github.com/RayRuizhiLiao/joint_chestxray.
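A simplified sketch of the joint image-text setup is given below: both branches are trained together, while the image branch alone is used at inference. The module names, heads, and the four severity levels are assumptions rather than the exact architecture.

    # Joint image-text training, image-only inference; architectural details are assumed.
    import torch.nn as nn

    class JointSeverityModel(nn.Module):
        def __init__(self, image_encoder, text_encoder, embed_dim, num_levels=4):
            super().__init__()
            self.image_encoder = image_encoder
            self.text_encoder = text_encoder
            self.img_head = nn.Linear(embed_dim, num_levels)
            self.txt_head = nn.Linear(embed_dim, num_levels)

        def forward(self, image, report=None):
            z_img = self.image_encoder(image)
            if report is None:                        # inference: chest radiograph only
                return self.img_head(z_img)
            z_txt = self.text_encoder(report)         # training: text guides the image representation
            return self.img_head(z_img), self.txt_head(z_txt), z_img, z_txt

A training objective would combine the two classification losses with a term encouraging z_img and z_txt to agree for the same study; the specific alignment loss is left open here.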
Abstract:We propose and demonstrate machine learning algorithms to assess the severity of pulmonary edema in chest x-ray images of congestive heart failure patients. Accurate assessment of pulmonary edema in heart failure is critical when making treatment and disposition decisions. Our work is grounded in a large-scale clinical dataset of over 300,000 x-ray images with associated radiology reports. While edema severity labels can be extracted unambiguously from a small fraction of the radiology reports, accurate annotation is challenging in most cases. To take advantage of the unlabeled images, we develop a Bayesian model that includes a variational auto-encoder for learning a latent representation from the entire image set, trained jointly with a regressor that employs this representation to predict pulmonary edema severity. Our experimental results suggest that modeling the distribution of images jointly with the limited labels improves the accuracy of pulmonary edema scoring compared to a strictly supervised approach. To the best of our knowledge, this is the first attempt to employ machine learning algorithms to automatically and quantitatively assess the severity of pulmonary edema in chest x-ray images.
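The flavor of the semi-supervised objective can be sketched as a VAE reconstruction-plus-KL term over all images and a severity-regression term on the labeled subset; the Gaussian reconstruction, squared-error regression, and the beta/gamma weights are assumptions standing in for the paper's Bayesian formulation.

    # Sketch of a semi-supervised VAE-plus-regressor objective; distributional choices assumed.
    import torch
    import torch.nn.functional as F

    def semi_supervised_loss(x, recon, mu, logvar, pred=None, label=None, beta=1.0, gamma=1.0):
        """VAE term over all images; regression term only when a severity label exists."""
        recon_loss = F.mse_loss(recon, x)                                  # reconstruction
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())      # KL to the prior
        loss = recon_loss + beta * kl
        if label is not None:                                              # labeled fraction only
            loss = loss + gamma * F.mse_loss(pred, label)
        return loss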