Shuli Jiang

The Cost of Shuffling in Private Gradient Based Optimization

Feb 05, 2025

Shuli Jiang, Pranay Sharma, Zhiwei Steven Wu, Gauri Joshi

Abstract: We consider the problem of differentially private (DP) convex empirical risk minimization (ERM). While the standard DP-SGD algorithm is theoretically well-established, practical implementations often rely on shuffled gradient methods that traverse the training data sequentially rather than sampling with replacement in each iteration. Despite their widespread use, the theoretical privacy-accuracy trade-offs of private shuffled gradient methods (DP-ShuffleG) remain poorly understood, leading to a gap between theory and practice. In this work, we leverage privacy amplification by iteration (PABI) and a novel application of Stein's lemma to provide the first empirical excess risk bound of DP-ShuffleG. Our result shows that data shuffling results in worse empirical excess risk for DP-ShuffleG compared to DP-SGD. To address this limitation, we propose Interleaved-ShuffleG, a hybrid approach that integrates public data samples in private optimization. By alternating optimization steps that use private and public samples, Interleaved-ShuffleG effectively reduces empirical excess risk. Our analysis introduces a new optimization framework with surrogate objectives, adaptive noise injection, and a dissimilarity metric, which can be of independent interest. Our experiments on diverse datasets and tasks demonstrate the superiority of Interleaved-ShuffleG over several baselines.

* 54 pages, 6 figures
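
The difference between traversing shuffled data and sampling with replacement can be illustrated with a minimal sketch. It is not the paper's DP-ShuffleG algorithm and carries no privacy accounting: the 1-D loss, the clipping threshold of 1.0, and the names `noisy_epoch`, `lr`, and `sigma` are all assumptions made for illustration.

```python
import random

def noisy_epoch(w, data, lr, sigma, shuffle, rng):
    """One pass of clipped, noisy gradient steps on f_i(w) = 0.5 * (w - x_i)^2.

    Hypothetical toy example: sigma is an arbitrary noise scale, not a
    calibrated differential-privacy parameter.
    """
    n = len(data)
    if shuffle:
        # Shuffled traversal: a random permutation, each sample used exactly once.
        order = rng.sample(range(n), n)
    else:
        # Sampling with replacement: some samples may repeat, others may be skipped.
        order = [rng.randrange(n) for _ in range(n)]
    for i in order:
        grad = w - data[i]
        grad = max(-1.0, min(1.0, grad))  # clip the per-sample gradient to [-1, 1]
        w -= lr * (grad + rng.gauss(0.0, sigma))  # Gaussian noise added to each step
    return w, order
```

Calling `noisy_epoch(0.0, data, 0.1, 0.5, True, random.Random(0))` runs one shuffled epoch; passing `shuffle=False` gives the with-replacement variant for comparison.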

Optimized Tradeoffs for Private Prediction with Majority Ensembling

Nov 27, 2024

Shuli Jiang, Qiuyi Zhang, Gauri Joshi

Abstract: We study a classical problem in private prediction, the problem of computing an $(m\epsilon, \delta)$-differentially private majority of $K$ $(\epsilon, \Delta)$-differentially private algorithms for $1 \leq m \leq K$ and $1 > \delta \geq \Delta \geq 0$. Standard methods such as subsampling or randomized response are widely used, but do they provide optimal privacy-utility tradeoffs? To answer this, we introduce the Data-dependent Randomized Response Majority (DaRRM) algorithm. It is parameterized by a data-dependent noise function $\gamma$, and enables efficient utility optimization over the class of all private algorithms, encompassing those standard methods. We show that maximizing the utility of an $(m\epsilon, \delta)$-private majority algorithm can be computed tractably through an optimization problem for any $m \leq K$ by a novel structural result that reduces the infinitely many privacy constraints into a polynomial set. In some settings, we show that DaRRM provably enjoys a privacy gain of a factor of 2 over common baselines, with fixed utility. Lastly, we demonstrate the strong empirical effectiveness of our first-of-its-kind privacy-constrained utility optimization for ensembling labels for private prediction from private teachers in image classification. Notably, our DaRRM framework with an optimized $\gamma$ exhibits substantial utility gains when compared against several baselines.

* 57 pages, 10 figures. Proceedings of Transactions on Machine Learning Research (TMLR), November 2024
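
As a rough illustration of a randomized-response majority with a count-dependent noise function, the sketch below outputs the true majority with probability `gamma(ones)` and a fair coin flip otherwise. It is a generic sketch, not the paper's DaRRM procedure or its optimized $\gamma$: the function name `rr_majority`, the tie-breaking rule, and the example `gamma` are assumptions.

```python
import random

def rr_majority(votes, gamma, rng):
    """Randomized-response majority over K binary outputs (illustrative only).

    votes: list of 0/1 outputs from K mechanisms.
    gamma: hypothetical function mapping the number of ones to a probability in [0, 1].
    """
    k = len(votes)
    ones = sum(votes)
    p = gamma(ones)  # probability of reporting the true majority
    if rng.random() < p:
        return 1 if 2 * ones > k else 0  # assumption: ties resolve to 0
    return rng.randrange(2)  # otherwise report an unbiased coin flip
```

A constant `gamma`, such as `lambda ones: 0.8`, ignores the vote count and corresponds to classical randomized response; a data-dependent `gamma` lets the reporting probability vary with how lopsided the vote is.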

Turning Generative Models Degenerate: The Power of Data Poisoning Attacks

Jul 18, 2024

Shuli Jiang, Swanand Ravindra Kadhe, Yi Zhou, Farhan Ahmed, Ling Cai, Nathalie Baracaldo

Abstract: The increasing use of large language models (LLMs) trained by third parties raises significant security concerns. In particular, malicious actors can introduce backdoors through poisoning attacks to generate undesirable outputs. While such attacks have been extensively studied in image domains and classification tasks, they remain underexplored for natural language generation (NLG) tasks. To address this gap, we conduct an investigation of various poisoning techniques targeting the LLM's fine-tuning phase via prefix-tuning, a Parameter Efficient Fine-Tuning (PEFT) method. We assess their effectiveness across two generative tasks: text summarization and text completion; and we also introduce new metrics to quantify the success and stealthiness of such NLG poisoning attacks. Through our experiments, we find that the prefix-tuning hyperparameters and trigger designs are the most crucial factors to influence attack success and stealthiness. Moreover, we demonstrate that existing popular defenses are ineffective against our poisoning attacks. Our study presents the first systematic approach to understanding poisoning attacks targeting NLG tasks during fine-tuning via PEFT across a wide range of triggers and attack settings. We hope our findings will aid the AI security community in developing effective defenses against such threats.

* 18 pages, 11 figures

Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks

Dec 07, 2023

Shuli Jiang, Swanand Ravindra Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo

Abstract: Growing applications of large language models (LLMs) trained by a third party raise serious concerns about the security vulnerabilities of LLMs. It has been demonstrated that malicious actors can covertly exploit these vulnerabilities in LLMs through poisoning attacks aimed at generating undesirable outputs. While poisoning attacks have received significant attention in the image domain (e.g., object detection) and classification tasks, their implications for generative models, particularly in the realm of natural language generation (NLG) tasks, remain poorly understood. To bridge this gap, we perform a comprehensive exploration of various poisoning techniques to assess their effectiveness across a range of generative tasks. Furthermore, we introduce a range of metrics designed to quantify the success and stealthiness of poisoning attacks specifically tailored to NLG tasks. Through extensive experiments on multiple NLG tasks, LLMs and datasets, we show that it is possible to successfully poison an LLM during the fine-tuning stage using as little as 1% of the total tuning data samples. Our paper presents the first systematic approach to comprehend poisoning attacks targeting NLG tasks considering a wide range of triggers and attack settings. We hope our findings will assist the AI security community in devising appropriate defenses against such threats.

* 19 pages, 6 figures. Published at NeurIPS 2023 Workshop on Backdoors in Deep Learning: The Good, the Bad, and the Ugly

Correlation Aware Sparsified Mean Estimation Using Random Projection

Oct 29, 2023

Shuli Jiang, Pranay Sharma, Gauri Joshi

Abstract: We study the problem of communication-efficient distributed vector mean estimation, a commonly used subroutine in distributed optimization and Federated Learning (FL). Rand-$k$ sparsification is a commonly used technique to reduce communication cost, where each client sends $k < d$ of its coordinates to the server. However, Rand-$k$ is agnostic to any correlations that might exist between clients in practical scenarios. The recently proposed Rand-$k$-Spatial estimator leverages the cross-client correlation information at the server to improve Rand-$k$'s performance. Yet, the performance of Rand-$k$-Spatial is suboptimal. We propose the Rand-Proj-Spatial estimator with a more flexible encoding-decoding procedure, which generalizes the encoding of Rand-$k$ by projecting the client vectors to a random $k$-dimensional subspace. We utilize Subsampled Randomized Hadamard Transform (SRHT) as the projection matrix and show that Rand-Proj-Spatial with SRHT outperforms Rand-$k$-Spatial, using the correlation information more efficiently. Furthermore, we propose an approach to incorporate varying degrees of correlation and suggest a practical variant of Rand-Proj-Spatial when the correlation information is not available to the server. Experiments on real-world distributed optimization tasks showcase the superior performance of Rand-Proj-Spatial compared to Rand-$k$-Spatial and other more sophisticated sparsification techniques.

* 32 pages, 13 figures. Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, USA
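
The baseline Rand-$k$ scheme mentioned in the abstract can be sketched as follows: each client keeps $k$ uniformly chosen coordinates, and the server rescales by $d/k$ so the averaged estimate is unbiased. This covers only plain Rand-$k$, not Rand-$k$-Spatial or Rand-Proj-Spatial; the function names and the dictionary message format are assumptions.

```python
import random

def rand_k_encode(x, k, rng):
    """Client side: keep k of the d coordinates, chosen uniformly without replacement."""
    idx = rng.sample(range(len(x)), k)
    return {j: x[j] for j in idx}  # hypothetical message format: index -> value

def rand_k_mean(messages, d, k):
    """Server side: unbiased mean estimate from the sparsified client messages."""
    n = len(messages)
    est = [0.0] * d
    for msg in messages:
        for j, v in msg.items():
            # Each coordinate is kept with probability k/d, so scale by d/k.
            est[j] += (d / k) * v / n
    return est
```

With `k == d` every coordinate is sent and the estimate equals the exact mean; smaller `k` reduces communication at the cost of higher variance.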

D.MCA: Outlier Detection with Explicit Micro-Cluster Assignments

Oct 15, 2022

Shuli Jiang, Robson Leonardo Ferreira Cordeiro, Leman Akoglu

Abstract: How can we detect outliers, both scattered and clustered, and also explicitly assign them to respective micro-clusters, without knowing a priori how many micro-clusters exist? How can we perform both tasks in-house, i.e., without any post-hoc processing, so that both detection and assignment can benefit simultaneously from each other? Presenting outliers in separate micro-clusters is informative to analysts in many real-world applications. However, a naïve solution based on post-hoc clustering of the outliers detected by any existing method suffers from two main drawbacks: (a) appropriate hyperparameter values are commonly unknown for clustering, and most algorithms struggle with clusters of varying shapes and densities; (b) detection and assignment cannot benefit from one another. In this paper, we propose D.MCA to $\underline{D}$etect outliers with explicit $\underline{M}$icro-$\underline{C}$luster $\underline{A}$ssignment. Our method performs both detection and assignment iteratively, and in-house, by using a novel strategy that prunes entire micro-clusters out of the training set to improve the performance of the detection. It also benefits from a novel strategy that prevents clustered outliers from masking each other, which is a well-known problem in the literature. Also, D.MCA is designed to be robust to a critical hyperparameter by employing a hyperensemble "warm up" phase. Experiments performed on 16 real-world and synthetic datasets demonstrate that D.MCA outperforms 8 state-of-the-art competitors, especially on the explicit outlier micro-cluster assignment task.

* Proceedings of the 22nd IEEE International Conference on Data Mining (ICDM 2022)
