Gael Lederrey

ciDATGAN: Conditional Inputs for Tabular GANs

Oct 05, 2022

Gael Lederrey, Tim Hillel, Michel Bierlaire


Abstract: Conditionality has become a core component of Generative Adversarial Networks (GANs) for generating synthetic images. GANs usually use latent conditionality to control the generation process. However, tabular data contains only manifest variables. Thus, latent conditionality either restricts the generated data or does not produce sufficiently good results. Therefore, we propose a new methodology, inspired by image completion methods, to include conditionality in tabular GANs. This article presents ciDATGAN, an evolution of the Directed Acyclic Tabular GAN (DATGAN), which has already been shown to outperform state-of-the-art tabular GAN models. First, we show that the addition of conditional inputs does not hinder the model's performance compared to its predecessor. Then, we demonstrate that ciDATGAN can be used to unbias datasets with the help of well-chosen conditional inputs. Finally, we show that ciDATGAN can learn the logic behind the data and can thus be used to complete large synthetic datasets using data from a smaller feeder dataset.

* Technical report, 21 pages


DATGAN: Integrating expert knowledge into deep learning for synthetic tabular data

Mar 07, 2022

Gael Lederrey, Tim Hillel, Michel Bierlaire


Abstract: Synthetic data can be used in various applications, such as correcting biased datasets or replacing scarce original data for simulation purposes. Generative Adversarial Networks (GANs) are considered state-of-the-art for developing generative models. However, these deep learning models are data-driven, which makes it difficult to control the generation process. This can lead to the following issues: a lack of representativity in the generated data, the introduction of bias, and the possibility of overfitting the sample's noise. This article presents the Directed Acyclic Tabular GAN (DATGAN), which addresses these limitations by integrating expert knowledge into deep learning models for synthetic tabular data generation. This approach allows the interactions between variables to be specified explicitly using a Directed Acyclic Graph (DAG). The DAG is then converted to a network of Long Short-Term Memory (LSTM) cells that are modified to accept multiple inputs. Multiple DATGAN versions are systematically tested on multiple assessment metrics. We show that the best versions of the DATGAN outperform state-of-the-art generative models on multiple case studies. Finally, we show how the DAG can be used to create hypothetical synthetic datasets.

* 43 pages for the article and 32 pages of supplementary materials. This preprint will soon be submitted
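
The DATGAN abstract describes specifying variable interactions with a DAG that is then turned into a network of cells. As a rough illustration of that first step only, the sketch below encodes a small DAG and derives an order in which each variable comes after the variables it depends on. The variable names, the `parents` mapping, and the `generation_order` helper are hypothetical and are not taken from the paper or from any DATGAN code; the sketch uses only Python's standard `graphlib` module and does not build any LSTM cells.

```python
from graphlib import TopologicalSorter

# Hypothetical variables for a travel-survey table. These names are
# illustrative assumptions, not variables from the DATGAN paper.
# Each key maps a variable to the set of variables it depends on (its parents).
parents = {
    "age": set(),
    "income": {"age"},
    "car_ownership": {"age", "income"},
    "travel_mode": {"income", "car_ownership"},
}


def generation_order(parents):
    """Return an order in which every variable appears after all its parents.

    Raises graphlib.CycleError if the specified graph contains a cycle,
    that is, if it is not a valid DAG.
    """
    return list(TopologicalSorter(parents).static_order())


order = generation_order(parents)
print(order)
```

An ordering of this kind is one plausible way to decide which variable's cell receives the outputs of which other cells; how the paper actually wires the modified LSTM cells is not stated in the abstract.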
