Title: Towards Principled Assessment of Tabular Data Synthesis Algorithms

Feb 09, 2024

Yuntao Du, Ninghui Li

Abstract: Data synthesis has been advocated as an important approach for utilizing data while protecting data privacy. A large number of tabular data synthesis algorithms (which we call synthesizers) have been proposed. Some synthesizers satisfy Differential Privacy, while others aim to provide privacy in a heuristic fashion. A comprehensive understanding of the strengths and weaknesses of these synthesizers remains elusive for two reasons: principled evaluation metrics are lacking, and newly developed synthesizers that take advantage of diffusion models and large language models have not been compared head-to-head with state-of-the-art marginal-based synthesizers. In this paper, we present a principled and systematic evaluation framework for assessing tabular data synthesis algorithms. Specifically, we examine and critique existing evaluation metrics, and introduce a set of new metrics in terms of fidelity, privacy, and utility to address their limitations. Based on the proposed metrics, we also devise a unified objective for tuning, which can consistently improve the quality of synthetic data for all methods. We conduct extensive evaluations of 8 different types of synthesizers on 12 datasets and identify several findings that offer new directions for privacy-preserving data synthesis.

* The code is available at: https://github.com/zealscott/SynMeter
