Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Joon Lee

Unsupervised Feature Selection to Identify Important ICD-10 Codes for Machine Learning: A Case Study on a Coronary Artery Disease Patient Cohort

Mar 25, 2023

Peyman Ghasemi, Joon Lee

Abstract:The use of International Classification of Diseases (ICD) codes in healthcare presents a challenge in selecting relevant codes as features for machine learning models due to this system's large number of codes. In this study, we compared several unsupervised feature selection methods for an ICD code database of 49,075 coronary artery disease patients in Alberta, Canada. Specifically, we employed Laplacian Score, Unsupervised Feature Selection for Multi-Cluster Data, Autoencoder Inspired Unsupervised Feature Selection, Principal Feature Analysis, and Concrete Autoencoders with and without ICD tree weight adjustment to select the 100 best features from over 9,000 codes. We assessed the selected features based on their ability to reconstruct the initial feature space and predict 90-day mortality following discharge. Our findings revealed that the Concrete Autoencoder methods outperformed all other methods in both tasks. Furthermore, the weight adjustment in the Concrete Autoencoder method decreased the complexity of features.

* Submitted to AMIA 2023 Conference

Via

Access Paper or Ask Questions

Machine Learning Capability: A standardized metric using case difficulty with applications to individualized deployment of supervised machine learning

Feb 09, 2023

Adrienne Kline, Joon Lee

Abstract:Model evaluation is a critical component in supervised machine learning classification analyses. Traditional metrics do not currently incorporate case difficulty. This renders the classification results unbenchmarked for generalization. Item Response Theory (IRT) and Computer Adaptive Testing (CAT) with machine learning can benchmark datasets independent of the end-classification results. This provides high levels of case-level information regarding evaluation utility. To showcase, two datasets were used: 1) health-related and 2) physical science. For the health dataset a two-parameter IRT model, and for the physical science dataset a polytonomous IRT model, was used to analyze predictive features and place each case on a difficulty continuum. A CAT approach was used to ascertain the algorithms' performance and applicability to new data. This method provides an efficient way to benchmark data, using only a fraction of the dataset (less than 1%) and 22-60x more computationally efficient than traditional metrics. This novel metric, termed Machine Learning Capability (MLC) has additional benefits as it is unbiased to outcome classification and a standardized way to make model comparisons within and across datasets. MLC provides a metric on the limitation of supervised machine learning algorithms. In situations where the algorithm falls short, other input(s) are required for decision-making.

Via

Access Paper or Ask Questions

Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Jun 13, 2022

Weiwei Zong, Eric Carver, Simeng Zhu, Eric Schaff, Daniel Chapman, Joon Lee, Hassan Bagher Ebadian, Indrin Chetty, Benjamin Movsas, Winston Wen(+2 more)

Figure 1 for Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Figure 2 for Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Figure 3 for Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Figure 4 for Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Abstract:Automatic diagnosis of malignant prostate cancer patients from mpMRI has been studied heavily in the past years. Model interpretation and domain drift have been the main road blocks for clinical utilization. As an extension from our previous work where we trained a customized convolutional neural network on a public cohort with 201 patients and the cropped 2D patches around the region of interest were used as the input, the cropped 2.5D slices of the prostate glands were used as the input, and the optimal model were searched in the model space using autoKeras. Something different was peripheral zone (PZ) and central gland (CG) were trained and tested separately, the PZ detector and CG detector were demonstrated effectively in highlighting the most suspicious slices out of a sequence, hopefully to greatly ease the workload for the physicians.

* arXiv admin note: text overlap with arXiv:1903.12331

Via

Access Paper or Ask Questions

A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Apr 04, 2019

Weiwei Zong, Joon Lee, Chang Liu, Eric Carver, Aharon Feldman, Branislava Janic, Mohamed Elshaikh, Milan Pantelic, David Hearshen, Indrin Chetty(+2 more)

Figure 1 for A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Figure 2 for A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Figure 3 for A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Figure 4 for A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Abstract:Data scarcity has refrained deep learning models from making greater progress in prostate images analysis using multiparametric MRI. In this paper, an efficient convolutional neural network (CNN) was developed to classify lesion malignancy for prostate cancer patients, based on which model interpretation was systematically analyzed to bridge the gap between natural images and MR images, which is the first one of its kind in the literature. The problem of small sample size was addressed and successfully tackled by feeding the intermediate features into a traditional classification algorithm known as weighted extreme learning machine, with imbalanced distribution among output categories taken into consideration. Model trained on public data set was able to generalize to data from an independent institution to make accurate prediction. The generated saliency map was found to overlay well with the lesion and could benefit clinicians for diagnosing purpose.

Via

Access Paper or Ask Questions

Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Apr 04, 2019

Zhenzhen Dai, Eric Carver, Chang Liu, Joon Lee, Aharon Feldman, Weiwei Zong, Milan Pantelic, Mohamed Elshaikh, Ning Wen

Figure 1 for Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Figure 2 for Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Figure 3 for Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Figure 4 for Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Abstract:Prostate cancer (PCa) is the most common cancer in men in the United States. Multiparametic magnetic resonance imaging (mp-MRI) has been explored by many researchers to targeted prostate biopsies and radiation therapy. However, assessment on mp-MRI can be subjective, development of computer-aided diagnosis systems to automatically delineate the prostate gland and the intraprostratic lesions (ILs) becomes important to facilitate with radiologists in clinical practice. In this paper, we first study the implementation of the Mask-RCNN model to segment the prostate and ILs. We trained and evaluated models on 120 patients from two different cohorts of patients. We also used 2D U-Net and 3D U-Net as benchmarks to segment the prostate and compared the model's performance. The contour variability of ILs using the algorithm was also benchmarked against the interobserver variability between two different radiation oncologists on 19 patients. Our results indicate that the Mask-RCNN model is able to reach state-of-art performance in the prostate segmentation and outperforms several competitive baselines in ILs segmentation.

Via

Access Paper or Ask Questions

Video Highlight Prediction Using Audience Chat Reactions

Jul 26, 2017

Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg

Figure 1 for Video Highlight Prediction Using Audience Chat Reactions

Figure 2 for Video Highlight Prediction Using Audience Chat Reactions

Figure 3 for Video Highlight Prediction Using Audience Chat Reactions

Figure 4 for Video Highlight Prediction Using Audience Chat Reactions

Abstract:Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.

* EMNLP 2017

Via

Access Paper or Ask Questions