Abstract:Data is an increasingly vital component of decision making processes across industries. However, data access raises privacy concerns motivating the need for privacy-preserving techniques such as differential privacy. Data markets provide a means to enable wider access as well as determine the appropriate privacy-utility trade-off. Existing data market frameworks either require a trusted third party to perform computationally expensive valuations or are unable to capture the combinatorial nature of data value and do not endogenously model the effect of differential privacy. This paper addresses these shortcomings by proposing a valuation mechanism based on the Wasserstein distance for differentially-private data, and corresponding procurement mechanisms by leveraging incentive mechanism design theory, for task-agnostic data procurement, and task-specific procurement co-optimisation. The mechanisms are reformulated into tractable mixed-integer second-order cone programs, which are validated with numerical studies.
Abstract:We provide an exact expressions for the 1-Wasserstein distance between independent location-scale distributions. The expressions are represented using location and scale parameters and special functions such as the standard Gaussian CDF or the Gamma function. Specifically, we find that the 1-Wasserstein distance between independent univariate location-scale distributions is equivalent to the mean of a folded distribution within the same family whose underlying location and scale are equal to the difference of the locations and scales of the original distributions. A new linear upper bound on the 1-Wasserstein distance is presented and the asymptotic bounds of the 1-Wasserstein distance are detailed in the Gaussian case. The effect of differential privacy using the Laplace and Gaussian mechanisms on the 1-Wasserstein distance is studied using the closed-form expressions and bounds.