Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods

May 10, 2023

Nhat-Hao Pham, Khanh-Linh Vo, Mai Anh Vu, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

Abstract: Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can make correlation coefficients difficult to estimate. In this paper, we compare the effects of various missing-data methods on the correlation plot, focusing on two common missing-data patterns: random and monotone. We aim to provide practical strategies and recommendations for researchers and practitioners who create and analyze correlation plots. Our experimental results suggest that, although imputation is commonly used to handle missing data, plotting the correlation matrix from imputed data may lead to significantly misleading inferences about the relationships between features. Based on its performance in the experiments, we recommend DPER, a direct parameter estimation approach, for plotting the correlation matrix.
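To illustrate the kind of distortion the abstract describes, here is a minimal sketch on synthetic data. It is not the paper's experiment and it does not implement DPER: as a simple stand-in for estimating the correlation directly, it uses the pairwise-complete estimate that pandas `DataFrame.corr` computes when values are missing. The sample size, true correlation of 0.8, 40% missing rate, and column names `x` and `y` are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): two Gaussian features with true correlation 0.8.
rng = np.random.default_rng(0)
n = 2000
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
data = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
df = pd.DataFrame(data, columns=["x", "y"])

# Remove about 40% of the values of y completely at random (assumed rate).
mask = rng.random(n) < 0.4
df_missing = df.copy()
df_missing.loc[mask, "y"] = np.nan

# Reference: correlation computed on the fully observed data.
corr_full = df.corr().loc["x", "y"]

# Mean imputation: fill each missing value with the column mean, then correlate.
corr_mean_imputed = df_missing.fillna(df_missing.mean()).corr().loc["x", "y"]

# Pairwise-complete estimate: DataFrame.corr skips rows where either value is NaN.
# This is a stand-in for a direct estimate, NOT the DPER algorithm.
corr_pairwise = df_missing.corr().loc["x", "y"]

print(f"full data:        {corr_full:.3f}")
print(f"mean imputation:  {corr_mean_imputed:.3f}")
print(f"pairwise estimate: {corr_pairwise:.3f}")
```

In this setup the mean-imputed correlation comes out noticeably weaker than the full-data value, because every imputed entry sits at the column mean and contributes no co-variation with `x`, while the estimate that skips missing entries stays close to the reference. A heatmap drawn from the imputed matrix would therefore understate the relationship.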