Marcel Jiřina

Missing Features Reconstruction Using a Wasserstein Generative Adversarial Imputation Network

Jun 21, 2020

Magda Friedjungová, Daniel Vašata, Maksym Balatsko, Marcel Jiřina

Abstract: Missing data is one of the most common preprocessing problems. In this paper, we experimentally investigate the use of generative and non-generative models for feature reconstruction. Variational Autoencoder with Arbitrary Conditioning (VAEAC) and Generative Adversarial Imputation Network (GAIN) were studied as representatives of generative models, while the denoising autoencoder (DAE) represented non-generative models. The performance of the models is compared to the traditional methods k-nearest neighbors (k-NN) and Multiple Imputation by Chained Equations (MICE). Moreover, we introduce WGAIN as the Wasserstein modification of GAIN, which turns out to be the best imputation model when the degree of missingness is less than or equal to 30%. Experiments were performed on real-world and artificial datasets with continuous features where different percentages of features, varying from 10% to 50%, were missing. The algorithms were evaluated by measuring the accuracy of a classification model previously trained on the uncorrupted dataset. The results show that GAIN and especially WGAIN are the best imputers regardless of the conditions. In general, they outperform or are comparable to MICE, k-NN, DAE, and VAEAC.

* Computational Science - ICCS 2020. ICCS 2020. Lecture Notes in Computer Science 12140 (2020) 225-239
* Preprint of the conference paper (ICCS 2020), part of the Lecture Notes in Computer Science
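
The evaluation protocol described in the abstract (train a classifier on uncorrupted data, remove a percentage of values, impute, then measure the classifier's accuracy on the imputed data) can be illustrated with a minimal sketch. This is not the authors' code: the nearest-centroid classifier, the synthetic two-class Gaussian data, and all function names are assumptions chosen to keep the example dependency-free, and a simple k-NN imputer stands in for the models compared in the paper.

```python
import random

def fit_centroids(X, y):
    """Toy stand-in classifier (assumption): one mean vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)

def mask_values(X, rate, rng):
    """Replace each value with None independently with probability `rate`."""
    return [[None if rng.random() < rate else v for v in row] for row in X]

def knn_impute(X_missing, X_ref, k=5):
    """Fill each None with the mean of that feature over the k reference rows
    closest to the incomplete row (distance uses observed features only)."""
    filled = []
    for row in X_missing:
        observed = [j for j, v in enumerate(row) if v is not None]
        neighbours = sorted(
            X_ref, key=lambda r: sum((row[j] - r[j]) ** 2 for j in observed))[:k]
        filled.append([v if v is not None else sum(r[j] for r in neighbours) / k
                       for j, v in enumerate(row)])
    return filled

def make_data(n, rng):
    """Synthetic data (assumption): two Gaussian classes with four features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(3.0 * label, 1.0) for _ in range(4)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(100, rng)

# The classifier is trained once, on complete data, and never retrained.
model = fit_centroids(X_train, y_train)
results = {"clean": accuracy(model, X_test, y_test)}

# Corrupt the test set at several missingness rates, impute, and re-score.
for rate in (0.1, 0.3, 0.5):
    X_imputed = knn_impute(mask_values(X_test, rate, rng), X_train)
    results[rate] = accuracy(model, X_imputed, y_test)

for key, acc in results.items():
    print(key, round(acc, 3))
```

Swapping `knn_impute` for another imputer while keeping `model` fixed gives the kind of comparison the abstract reports, with downstream accuracy as the score for each imputation method.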

Unsupervised Latent Space Translation Network

Mar 20, 2020

Magda Friedjungová, Daniel Vašata, Tomáš Chobola, Marcel Jiřina

Abstract: One task that is often discussed in computer vision is image-to-image translation: the mapping of an image from one domain to a corresponding image in another domain. There are currently several approaches to this task. In this paper, we present an enhancement of the UNIT framework that helps remove its main drawbacks. More specifically, we introduce an additional adversarial discriminator on the latent representation, used instead of VAE, which enforces the latent space distributions of both domains to be similar. On MNIST and USPS domain adaptation tasks, this approach greatly outperforms competing approaches.

* To be published in conference proceedings of ESANN 2020

Missing Features Reconstruction and Its Impact on Classification Accuracy

Nov 09, 2019

Magda Friedjungová, Daniel Vašata, Marcel Jiřina

Abstract: In real-world applications, we can encounter situations in which a well-trained model has to be used to predict from a damaged dataset. The damage caused by missing or corrupted values can be either on the level of individual instances or on the level of entire features. Both situations have a negative impact on the usability of the model on such a dataset. This paper focuses on the scenario where entire features are missing, which can be understood as a specific case of transfer learning. Our aim is to experimentally investigate the influence of various imputation methods on the performance of several classification models. The impact of imputation is studied by comparing traditional methods such as k-NN, linear regression, and MICE with modern imputation methods such as multi-layer perceptron (MLP) and gradient boosted trees (XGBT). For linear regression, MLP, and XGBT we also propose two approaches to using them for multiple features imputation. The experiments were performed on both real-world and artificial datasets with continuous features where different numbers of features, varying from one feature to 50%, were missing. The results show that MICE and linear regression are generally good imputers regardless of the conditions. On the other hand, the performance of MLP and XGBT is strongly dataset dependent. Their performance is the best in some cases, but more often they perform worse than MICE or linear regression.

* Computational Science - ICCS 2019. ICCS 2019. Lecture Notes in Computer Science 11538 (2019) 207-220
* Preprint of the conference paper (ICCS 2019), part of the Lecture Notes in Computer Science
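
The scenario in this abstract, where an entire feature is absent at prediction time and is reconstructed from the remaining ones, can be sketched with linear regression, one of the imputers the abstract names. This is an illustrative sketch only, not the authors' implementation: the synthetic data, the choice of which feature goes missing, and the helper names (`solve`, `fit_linear`, `predict_linear`) are all assumptions.

```python
import random

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting.
    Assumes A is non-singular (no handling of degenerate systems)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    Z = [[1.0] + list(x) for x in X]
    d = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(d)] for i in range(d)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(d)]
    return solve(A, b)

def predict_linear(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def make_rows(n, rng):
    """Synthetic data (assumption): x2 depends linearly on x0 and x1."""
    rows = []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append([x0, x1, 0.5 + 2.0 * x0 - 1.0 * x1 + rng.gauss(0, 0.05)])
    return rows

rng = random.Random(1)
train = make_rows(200, rng)      # complete data, available before deployment
deployed = make_rows(50, rng)    # same source, but feature 2 will be withheld

# Learn to predict the feature that will go missing from the other features.
weights = fit_linear([r[:2] for r in train], [r[2] for r in train])

# At deployment only features 0 and 1 arrive; append the reconstructed feature 2.
reconstructed = [r[:2] + [predict_linear(weights, r[:2])] for r in deployed]

rmse = (sum((rec[2] - true[2]) ** 2 for rec, true in zip(reconstructed, deployed))
        / len(deployed)) ** 0.5
print("weights:", [round(w, 2) for w in weights])
print("reconstruction RMSE:", round(rmse, 3))
```

The reconstructed rows have the same shape as the training data, so a classifier trained on the complete dataset can be applied to them unchanged, which is the setting the abstract evaluates.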
