Nick Hynes

Towards Efficient Data Valuation Based on the Shapley Value

Feb 27, 2019

Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gurel, Bo Li, Ce Zhang, Dawn Song, Costas Spanos

Abstract: "How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.
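
To make the cost concern in the abstract concrete, here is a minimal sketch that is not taken from the paper. It contrasts exact Shapley computation, which averages each point's marginal contribution over all n! orderings, with the standard permutation-sampling estimator, a generic baseline rather than the authors' algorithms. The function names, `WEIGHTS`, and `toy_utility` are hypothetical; in data valuation the utility would typically be something like the test performance of a model trained on the given subset.

```python
import itertools
import math
import random


def exact_shapley(n, utility):
    """Exact Shapley values: average marginal contribution over all n! orderings."""
    values = [0.0] * n
    for perm in itertools.permutations(range(n)):
        coalition = set()
        prev = utility(frozenset(coalition))
        for i in perm:
            coalition.add(i)
            cur = utility(frozenset(coalition))
            values[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    total = math.factorial(n)
    return [v / total for v in values]


def monte_carlo_shapley(n, utility, num_permutations, seed=0):
    """Estimate Shapley values from randomly sampled orderings (generic baseline)."""
    rng = random.Random(seed)
    values = [0.0] * n
    players = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(players)
        coalition = set()
        prev = utility(frozenset(coalition))
        for i in players:
            coalition.add(i)
            cur = utility(frozenset(coalition))
            values[i] += cur - prev
            prev = cur
    return [v / num_permutations for v in values]


# Hypothetical toy utility with diminishing returns: each "data point" has a
# weight, and a subset is worth the square root of its total weight.
WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def toy_utility(coalition):
    return math.sqrt(sum(WEIGHTS[i] for i in coalition))


if __name__ == "__main__":
    print("exact:   ", exact_shapley(len(WEIGHTS), toy_utility))
    print("sampled: ", monte_carlo_shapley(len(WEIGHTS), toy_utility, 2000))
```

The exact routine evaluates the utility n * n! times, which is already impractical for a few dozen points, while the sampled version costs n evaluations per permutation and trades accuracy for a tunable budget. Both satisfy the efficiency property: the values sum to the utility of the full set minus the utility of the empty set.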

Efficient Deep Learning on Multi-Source Private Data

Jul 17, 2018

Nick Hynes, Raymond Cheng, Dawn Song

Abstract: Machine learning models benefit from large and diverse datasets. Using such datasets, however, often requires trusting a centralized data aggregator. For sensitive applications like healthcare and finance, this is undesirable as it could compromise patient privacy or divulge trade secrets. Recent advances in secure and privacy-preserving computation, including trusted hardware enclaves and differential privacy, offer a way for mutually distrusting parties to efficiently train a machine learning model without revealing the training data. In this work, we introduce Myelin, a deep learning framework which combines these privacy-preservation primitives, and use it to establish a baseline level of performance for fully private machine learning.
