Abstract:While machine learning has become pervasive in as diversified fields as industry, healthcare, social networks, privacy concerns regarding the training data have gained a critical importance. In settings where several parties wish to collaboratively train a common model without jeopardizing their sensitive data, the need for a private training protocol is particularly stringent and implies to protect the data against both the model's end-users and the actors of the training phase. Differential privacy (DP) and cryptographic primitives are complementary popular countermeasures against privacy attacks. Among these cryptographic primitives, fully homomorphic encryption (FHE) offers ciphertext malleability at the cost of time-consuming operations in the homomorphic domain. In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator which is both fast when homomorphically executed and whose inaccuracy is used as a feature to ensure DP guarantees. Even if SHIELD could have other applications, we here focus on one setting and seamlessly integrate it in the SPEED collaborative training framework from "SPEED: Secure, PrivatE, and Efficient Deep learning" (Grivet S\'ebert et al., 2021) to improve its computational efficiency. After thoroughly describing the FHE implementation of our algorithm and its DP analysis, we present experimental results. To the best of our knowledge, it is the first work in which relaxing the accuracy of an homomorphic calculation is constructively usable as a degree of freedom to achieve better FHE performances.
Abstract:This paper tackles the problem of ensuring training data privacy in a federated learning context. Relying on Fully Homomorphic Encryption (FHE) and Differential Privacy (DP), we propose a secure framework addressing an extended threat model with respect to privacy of the training data. Notably, the proposed framework protects the privacy of the training data from all participants, namely the training data owners and an aggregating server. In details, while homomorphic encryption blinds a semi-honest server at learning stage, differential privacy protects the data from semi-honest clients participating in the training process as well as curious end-users with black-box or white-box access to the trained model. This paper provides with new theoretical and practical results to enable these techniques to be effectively combined. In particular, by means of a novel stochastic quantization operator, we prove differential privacy guarantees in a context where the noise is quantified and bounded due to the use of homomorphic encryption. The paper is concluded by experiments which show the practicality of the entire framework in spite of these interferences in terms of both model quality (impacted by DP) and computational overheads (impacted by FHE).
Abstract:This paper addresses the issue of collaborative deep learning with privacy constraints. Building upon differentially private decentralized semi-supervised learning, we introduce homomorphically encrypted operations to extend the set of threats considered so far. While previous methods relied on the existence of an hypothetical 'trusted' third party, we designed specific aggregation operations in the encrypted domain that allow us to circumvent this assumption. This makes our method practical to real-life scenario where data holders do not trust any third party to process their datasets. Crucially the computational burden of the approach is maintained reasonable, making it suitable to deep learning applications. In order to illustrate the performances of our method, we carried out numerical experiments using image datasets in a classification context.