Title: Mimic-IV-ICD: A new benchmark for eXtreme MultiLabel Classification

Apr 27, 2023

Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler, Shao-Syuan Huang, Jie-Jyun Liu, Chih-Jen Lin

Abstract: Clinical notes are assigned ICD codes - sets of codes for diagnoses and procedures. In recent years, predictive machine learning models have been built for automatic ICD coding. However, there is a lack of widely accepted benchmarks for automated ICD coding models based on large-scale public EHR data. This paper proposes a public benchmark suite for ICD-10 coding using a large EHR dataset derived from MIMIC-IV, the most recent public EHR dataset. We implement and compare several popular methods for ICD coding prediction tasks to standardize data preprocessing and establish a comprehensive ICD coding benchmark dataset. This approach fosters reproducibility and model comparison, accelerating progress toward employing automated ICD coding in future studies. Furthermore, we create a new ICD-9 benchmark using MIMIC-IV data, providing more data points and a higher number of ICD codes than MIMIC-III. Our open-source code offers easy access to data processing steps, benchmark creation, and experiment replication for those with MIMIC-IV access, providing insights, guidance, and protocols to efficiently develop ICD coding models.
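
As a rough illustration of the multi-label framing described in the abstract, where each clinical note carries a set of ICD codes, the sketch below encodes code sets as multi-hot vectors and scores predictions with micro-averaged F1. It is not taken from the paper's code: the function names are hypothetical, the three ICD-10 codes are only illustrative, and micro-F1 is assumed here as a commonly used multi-label metric rather than one the abstract names.

```python
from typing import Dict, List, Set


def build_label_index(label_sets: List[Set[str]]) -> Dict[str, int]:
    """Map every code seen in the data to a column index (sorted for stability)."""
    codes = sorted(set().union(*label_sets))
    return {code: i for i, code in enumerate(codes)}


def multi_hot(labels: Set[str], index: Dict[str, int]) -> List[int]:
    """Encode one note's code set as a 0/1 vector over the label index."""
    vec = [0] * len(index)
    for code in labels:
        if code in index:  # codes outside the index are ignored
            vec[index[code]] = 1
    return vec


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and negatives over all notes."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative data only: two notes with hypothetical gold and predicted code sets.
gold = [{"I10", "E11.9"}, {"J18.9"}]
pred = [{"I10"}, {"J18.9", "I10"}]

index = build_label_index(gold)
print(multi_hot(gold[0], index))
print(round(micro_f1(gold, pred), 4))
```

Micro-averaging pools counts across all labels, so frequent codes dominate the score; with a very large label space, a macro-averaged or ranking-based metric may give a different picture.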

* Benchmark, Multilabel, Classification
