Yizhen Zhong

Improving Precancerous Case Characterization via Transformer-based Ensemble Learning

Dec 10, 2022

Yizhen Zhong, Jiajie Xiao, Thomas Vetterli, Mahan Matin, Ellen Loo, Jimmy Lin, Richard Bourgon, Ofer Shapira

Abstract: The application of natural language processing (NLP) to cancer pathology reports has focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer from precancerous cases. We achieved a macro-F1 score of 0.914 for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrate the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
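
For readers unfamiliar with the metric reported in this abstract, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch in plain Python, not the authors' evaluation code. The four class names come from the abstract; the function name `macro_f1` and the toy labels are assumptions made up for illustration.

```python
from collections import Counter

# Class names taken from the abstract.
CLASSES = ["negative", "non-advanced adenoma", "advanced adenoma", "CRC"]

def macro_f1(y_true, y_pred, classes=CLASSES):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in classes:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true or predicted examples contributes 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels invented for illustration; they are not data from the paper.
y_true = ["negative", "negative", "non-advanced adenoma",
          "advanced adenoma", "advanced adenoma", "CRC"]
y_pred = ["negative", "negative", "non-advanced adenoma",
          "advanced adenoma", "non-advanced adenoma", "CRC"]
score = macro_f1(y_true, y_pred)
```

In this toy example one advanced adenoma is mislabeled as non-advanced, which lowers the F1 of both adenoma classes while leaving the other two classes untouched.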

Evaluating the Portability of an NLP System for Processing Echocardiograms: A Retrospective, Multi-site Observational Study

Apr 02, 2019

Prakash Adekkanattu, Guoqian Jiang, Yuan Luo, Paul R. Kingsbury, Zhenxing Xu, Luke V. Rasmussen, Jennifer A. Pacheco, Richard C. Kiefer, Daniel J. Stone, Pascal S. Brandt(+7 more)

Abstract: While natural language processing (NLP) of unstructured clinical narratives holds potential for patient care and clinical research, portability of NLP approaches across multiple sites remains a major challenge. This study investigated the portability of an NLP system developed initially at the Department of Veterans Affairs (VA) to extract 27 key cardiac concepts from free-text or semi-structured echocardiograms from three academic medical centers: Weill Cornell Medicine, Mayo Clinic and Northwestern Medicine. While the NLP system showed high precision and recall for four target concepts (aortic valve regurgitation, left atrium size at end systole, mitral valve regurgitation, tricuspid valve regurgitation) across all sites, we found moderate or poor results for the remaining concepts, and the NLP system's performance varied between individual sites.

* Under review with AMIA 2019

Characterizing Design Patterns of EHR-Driven Phenotype Extraction Algorithms

Nov 15, 2018

Yizhen Zhong, Luke Rasmussen, Yu Deng, Jennifer Pacheco, Maureen Smith, Justin Starren, Wei-Qi Wei, Peter Speltz, Joshua Denny, Nephi Walton(+3 more)

Abstract: The automatic development of phenotype algorithms from Electronic Health Record data with machine learning (ML) techniques is of great interest, given that current practice is very time-consuming and resource-intensive. The extraction of design patterns from phenotype algorithms is essential to understand their rationale and standard, with great potential to automate the development process. In this pilot study, we perform network visualization on the design patterns and their associations with phenotypes and sites. We classify design patterns using the fragments from previously annotated phenotype algorithms as the ground truth. The classification performance is used as a proxy for coherence at the attribution level. The bag-of-words representation with knowledge-based features performed well in the classification task (macro-F1 score of 0.79). Good classification accuracy with simple features demonstrated the attribution coherence and the feasibility of automatic identification of design patterns. Our results point to both the feasibility and challenges of automatic identification of phenotyping design patterns, which would power the automatic development of phenotype algorithms.
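
To make "bag-of-words representation with knowledge-based features" concrete, here is a minimal sketch of how a fragment of a phenotype algorithm might be turned into such a feature vector. It is an illustration under stated assumptions, not the authors' feature set: the `CODE_SYSTEM_TERMS` lexicon, the `featurize` function, the `__mentions_code_system__` feature name, and the example fragment are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical lexicon standing in for a "knowledge-based" feature source.
CODE_SYSTEM_TERMS = {"icd9", "icd10", "cpt", "rxnorm", "loinc"}

def featurize(fragment):
    """Map a text fragment to token counts plus one knowledge-based flag."""
    # Bag of words: lowercase alphanumeric tokens and their counts.
    tokens = re.findall(r"[a-z0-9]+", fragment.lower())
    features = Counter(tokens)
    # Knowledge-based feature: does the fragment mention a coding system?
    features["__mentions_code_system__"] = int(
        any(token in CODE_SYSTEM_TERMS for token in tokens)
    )
    return features

# Invented fragment, for illustration only.
feats = featurize("At least 2 ICD9 codes for type 2 diabetes")
```

Feature dictionaries like `feats` could then be passed to any standard classifier; the abstract does not say which classifier was used, so none is assumed here.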

* 4 pages, accepted by IEEE BIBM 2018 as short paper

Developing a Portable Natural Language Processing Based Phenotyping System

Jul 17, 2018

Himanshu Sharma, Chengsheng Mao, Yizhen Zhang, Haleh Vatani, Liang Yao, Yizhen Zhong, Luke Rasmussen, Guoqian Jiang, Jyotishman Pathak, Yuan Luo

Abstract: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine-learning-based approaches. Our system utilizes UMLS to extract clinically relevant features from unstructured text and facilitates portability across different institutions and data systems by incorporating OHDSI's OMOP Common Data Model (CDM) to standardize necessary data elements. Our system can also store the key components of rule-based systems (e.g., regular expression matches) in the format of OMOP CDM, thus enabling the reuse, adaptation and extension of many existing rule-based clinical NLP systems. We experimented with our system on the corpus from i2b2's Obesity Challenge as a pilot study. Our system facilitates portable phenotyping of obesity and its 15 comorbidities based on unstructured patient discharge summaries, while achieving a performance that often ranked among the top 10 of the challenge participants. This standardization enables a consistent application of numerous rule-based and machine-learning-based classification techniques downstream.
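
The idea of storing regular expression matches in a common-data-model format can be sketched as follows. This is an assumption-laden illustration, not the system described in the abstract: the row fields are loosely modeled on the OMOP CDM `NOTE_NLP` table (with `offset` simplified to an integer), and the patterns, the `extract_rows` function, the `nlp_system` value, and the example note are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class NoteNlpRow:
    # Field names loosely follow the OMOP CDM NOTE_NLP table (an assumption).
    note_id: int
    snippet: str
    offset: int            # character offset of the match in the note
    lexical_variant: str   # the matched text as written
    nlp_system: str
    term_exists: str       # "Y" if asserted, "N" if negated

# Hypothetical patterns, written for this sketch only.
OBESITY_PATTERN = re.compile(r"\b(?:morbid(?:ly)?\s+)?obes(?:e|ity)\b", re.IGNORECASE)
NEGATION_PATTERN = re.compile(r"\b(?:no|not|denies|without)\b[^.]{0,30}$", re.IGNORECASE)

def extract_rows(note_id, text, window=30):
    """Turn each regex match into one NOTE_NLP-style row."""
    rows = []
    for match in OBESITY_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        left_context = text[start:match.start()]
        # Crude negation check: a cue word in the same sentence, just before the match.
        negated = bool(NEGATION_PATTERN.search(left_context))
        rows.append(NoteNlpRow(
            note_id=note_id,
            snippet=text[start:match.end() + window],
            offset=match.start(),
            lexical_variant=match.group(0),
            nlp_system="regex-sketch",
            term_exists="N" if negated else "Y",
        ))
    return rows

# Invented note text, for illustration only.
rows = extract_rows(1, "Patient is morbidly obese. No obesity in family history.")
```

Because each match becomes a plain row with a fixed set of fields, downstream rule-based or machine-learning steps can read the rows without knowing which pattern produced them, which is the portability benefit the abstract describes.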

* 13 pages
