Daniel Luna

Understanding the impact of class imbalance on the performance of chest x-ray image classifiers

Dec 23, 2021

Candelaria Mosquera, Luciana Ferrer, Diego Milone, Daniel Luna, Enzo Ferrante

Abstract: This work aims to understand the impact of class imbalance on the performance of chest x-ray classifiers, in light of the standard evaluation practices adopted by researchers in terms of discrimination and calibration performance. First, we conduct a literature study to analyze common scientific practices and confirm that: (1) even when dealing with highly imbalanced datasets, the community tends to use metrics that are dominated by the majority class; and (2) it is still uncommon to include calibration studies for chest x-ray classifiers, despite their importance in the context of healthcare. Second, we perform a systematic experiment on two major chest x-ray datasets to explore the behavior of several performance metrics under different class ratios and show that widely adopted metrics can conceal the performance on the minority class. Finally, we propose the adoption of two alternative metrics, the precision-recall curve and the Balanced Brier score, which better reflect the performance of the system in such scenarios. Our results indicate that current evaluation practices adopted by the research community for chest x-ray classifiers may not reflect the performance of such systems for computer-aided diagnosis in real clinical scenarios, and suggest alternatives to improve this situation.
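To make the abstract's point concrete, the sketch below shows how majority-dominated metrics can hide failure on a minority class. It is an illustration only, not the authors' code: the abstract does not define the Balanced Brier score, so the function here assumes it is the mean of the Brier scores computed separately on positive and negative samples. The toy labels, the constant-output classifier, and all function names are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Standard Brier score: mean squared error between probability and label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def balanced_brier_score(y_true, y_prob):
    """ASSUMED definition: average of the per-class Brier scores."""
    pos = [(p - 1.0) ** 2 for y, p in zip(y_true, y_prob) if y == 1]
    neg = [(p - 0.0) ** 2 for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return 0.5 * (sum(pos) / len(pos) + sum(neg) / len(neg))


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == (y == 1) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def recall(y_true, y_prob, threshold=0.5):
    """Fraction of positive samples predicted positive at the threshold."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    return sum(p >= threshold for p in positives) / len(positives)


# Hypothetical toy data: 95 negatives, 5 positives (5% prevalence).
y_true = [0] * 95 + [1] * 5
# Hypothetical classifier that outputs the same low score for every image.
y_prob = [0.05] * 100

acc = accuracy(y_true, y_prob)
rec = recall(y_true, y_prob)
standard = brier_score(y_true, y_prob)
balanced = balanced_brier_score(y_true, y_prob)

print(f"accuracy={acc:.2f} recall={rec:.2f}")
print(f"brier={standard:.4f} balanced_brier={balanced:.4f}")
```

On this toy set the classifier never detects a positive case, yet accuracy and the standard Brier score both look good because the 95 negatives dominate the average. Recall and the per-class average give the 5 positives equal weight and expose the failure.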
