Abstract: Machine Learning (ML) is crucial in many sectors, including computer vision. However, ML models trained on sensitive data face security challenges, as they can be attacked and leak information. Privacy-Preserving Machine Learning (PPML) addresses this by using Differential Privacy (DP) to balance utility and privacy. This study identifies image dataset characteristics that affect the utility and vulnerability of private and non-private Convolutional Neural Network (CNN) models. By analyzing multiple datasets and privacy budgets, we find that imbalanced datasets increase the vulnerability of minority classes, although DP mitigates this issue. Datasets with fewer classes improve both model utility and privacy, whereas datasets with high entropy or a low Fisher Discriminant Ratio (FDR) worsen the utility-privacy trade-off. These insights offer practitioners and researchers guidance for estimating and optimizing the utility-privacy trade-off on image datasets and for informing data and privacy modifications based on dataset characteristics.
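
To make the FDR characteristic mentioned above concrete, the following is a minimal illustrative sketch of how a Fisher Discriminant Ratio between two image classes could be estimated. The flattening of images into feature vectors, the two-class per-feature formula, and the averaging over features are assumptions for illustration, not the study's exact procedure.

```python
import numpy as np

def fisher_discriminant_ratio(images_a: np.ndarray, images_b: np.ndarray) -> float:
    """Estimate a dataset-level FDR between two classes of images.

    Per feature j: FDR_j = (mu_a,j - mu_b,j)^2 / (var_a,j + var_b,j);
    the mean over all pixel features is returned as a single score.
    """
    a = images_a.reshape(len(images_a), -1).astype(np.float64)
    b = images_b.reshape(len(images_b), -1).astype(np.float64)
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0) + b.var(axis=0) + 1e-12  # avoid division by zero
    return float(np.mean(numerator / denominator))

# Example with random stand-ins for two classes of 28x28 grayscale images
rng = np.random.default_rng(0)
class_a = rng.normal(0.3, 0.1, size=(100, 28, 28))
class_b = rng.normal(0.7, 0.1, size=(100, 28, 28))
print(f"Estimated FDR: {fisher_discriminant_ratio(class_a, class_b):.3f}")
```

A low score under this kind of measure indicates heavily overlapping class distributions, which is the regime the abstract associates with a worse utility-privacy trade-off.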
Abstract: Smartwatch health sensor data is increasingly utilized in smart health applications and patient monitoring, including stress detection. However, such medical data often comprises sensitive personal information and is resource-intensive to acquire for research purposes. In response to this challenge, we introduce the privacy-aware synthesis of multi-sensor smartwatch health readings related to moments of stress. Our method generates synthetic sequence data with Generative Adversarial Networks (GANs) while applying Differential Privacy (DP) safeguards to protect patient information during model training. To ensure the integrity of our synthetic data, we employ a range of quality assessments and monitor the plausibility of the synthetic data relative to the original. To test its usefulness, we train private machine learning models on a commonly used, albeit small, stress detection dataset, exploring strategies for enhancing the existing data foundation with our synthetic data. Through our GAN-based augmentation methods, we observe improvements in model performance, both in non-private (0.45% F1) and private (11.90-15.48% F1) training scenarios. We underline the potential of differentially private synthetic data for optimizing utility-privacy trade-offs, especially when real training samples are scarce.
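
One common way to combine GAN training with DP safeguards, as described above, is to privatize the discriminator, since it is the only network that sees real patient data. The sketch below illustrates this idea with PyTorch and Opacus; the network sizes, window shape, noise_multiplier, and max_grad_norm values are assumptions for illustration and do not reflect the paper's actual architecture or settings.

```python
# Illustrative sketch: DP-SGD on the GAN discriminator via Opacus.
# Only the discriminator processes real (sensitive) sensor sequences,
# so clipping and noising its gradients yields DP guarantees for training.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

SEQ_LEN, FEATURES, BATCH = 60, 6, 64                    # assumed window shape
real_windows = torch.randn(1024, SEQ_LEN * FEATURES)    # placeholder for real sensor data

discriminator = nn.Sequential(
    nn.Linear(SEQ_LEN * FEATURES, 128), nn.ReLU(), nn.Linear(128, 1)
)
d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
loader = DataLoader(TensorDataset(real_windows), batch_size=BATCH)

privacy_engine = PrivacyEngine()
discriminator, d_optimizer, loader = privacy_engine.make_private(
    module=discriminator,
    optimizer=d_optimizer,
    data_loader=loader,
    noise_multiplier=1.0,   # assumed DP-SGD noise level
    max_grad_norm=1.0,      # assumed per-sample clipping bound
)

bce = nn.BCEWithLogitsLoss()
for (real,) in loader:
    if len(real) == 0:      # Poisson sampling may yield empty batches
        continue
    d_optimizer.zero_grad()
    fake = torch.randn_like(real)   # stand-in for detached generator output
    loss = bce(discriminator(real), torch.ones(len(real), 1)) + \
           bce(discriminator(fake), torch.zeros(len(real), 1))
    loss.backward()
    d_optimizer.step()              # clipped, noised gradient update
# The generator trains with an ordinary optimizer on synthetic batches only
# and is omitted here for brevity.
```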
Abstract: Machine learning (ML) can help fight the COVID-19 pandemic by enabling rapid screening of large volumes of chest X-ray images. To perform such data analysis while maintaining patient privacy, we create ML models that satisfy Differential Privacy (DP). Previous work on private COVID-19 ML models is partly based on small or skewed datasets, lacks strong privacy guarantees, and does not investigate practical privacy. In this work, we therefore suggest several improvements to address these open gaps. We account for inherent class imbalances in the data and evaluate the utility-privacy trade-off more extensively and over stricter privacy budgets than previous work. Our evaluation is supported by empirically estimating practical privacy leakage through actual attacks. In theory, the introduced DP should limit and mitigate the information leakage threat posed by black-box Membership Inference Attacks (MIAs). Our practical privacy analysis is the first to test this hypothesis on the COVID-19 detection task. We also re-examine the evaluation on the MNIST database. Our results indicate that, given the task-dependent threat from MIAs, DP does not always improve practical privacy, which we show on the COVID-19 task. The results further suggest that, with increasing DP guarantees, empirical privacy leakage reaches an early plateau, so DP appears to have a limited impact on MIA defense. Our findings identify possibilities for better utility-privacy trade-offs, and we thus believe that empirical, attack-specific privacy estimation can play a vital role in tuning for practical privacy.
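
As an illustration of how empirical privacy leakage from black-box MIAs can be estimated, the sketch below implements a simple confidence-thresholding attack and reports the attacker's advantage (true-positive rate minus false-positive rate). The confidence arrays are synthetic stand-ins, and this baseline attack is only one possible instantiation, not necessarily the attack used in the paper.

```python
import numpy as np

def mia_advantage(member_conf: np.ndarray, nonmember_conf: np.ndarray) -> float:
    """Sweep a confidence threshold and return the best attack advantage,
    i.e. the maximum over thresholds of (TPR - FPR)."""
    thresholds = np.unique(np.concatenate([member_conf, nonmember_conf]))
    best = 0.0
    for t in thresholds:
        tpr = np.mean(member_conf >= t)      # members correctly flagged
        fpr = np.mean(nonmember_conf >= t)   # non-members wrongly flagged
        best = max(best, tpr - fpr)
    return best

# member_conf / nonmember_conf would hold the target model's maximum softmax
# confidence on training samples and on held-out samples, respectively.
rng = np.random.default_rng(1)
member_conf = rng.beta(8, 2, size=1000)      # members: typically higher confidence
nonmember_conf = rng.beta(5, 3, size=1000)   # non-members: typically lower confidence
print(f"Empirical MIA advantage: {mia_advantage(member_conf, nonmember_conf):.3f}")
```

An advantage near zero under such an attack would indicate little practical leakage, whereas a large gap between member and non-member confidences signals a vulnerable model; tracking this quantity across privacy budgets is one way to observe the early plateau described in the abstract.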