Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Walid Krichene

Google Research

Training Differentially Private Ad Prediction Models with Semi-Sensitive Features

Jan 26, 2024

Lynn Chua, Qiliang Cui, Badih Ghazi, Charlie Harrison, Pritish Kamath, Walid Krichene, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha(+2 more)

Abstract:Motivated by problems arising in digital advertising, we introduce the task of training differentially private (DP) machine learning models with semi-sensitive features. In this setting, a subset of the features is known to the attacker (and thus need not be protected) while the remaining features as well as the label are unknown to the attacker and should be protected by the DP guarantee. This task interpolates between training the model with full DP (where the label and all features should be protected) or with label DP (where all the features are considered known, and only the label should be protected). We present a new algorithm for training DP models with semi-sensitive features. Through an empirical evaluation on real ads datasets, we demonstrate that our algorithm surpasses in utility the baselines of (i) DP stochastic gradient descent (DP-SGD) run on all features (known and unknown), and (ii) a label DP algorithm run only on the known features (while discarding the unknown ones).

* 7 pages, 4 figures

Via

Access Paper or Ask Questions

Private Learning with Public Features

Oct 24, 2023

Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Shuang Song, Abhradeep Thakurta, Li Zhang

Abstract:We study a class of private learning problems in which the data is a join of private and public features. This is often the case in private personalization tasks such as recommendation or ad prediction, in which features related to individuals are sensitive, while features related to items (the movies or songs to be recommended, or the ads to be shown to users) are publicly available and do not require protection. A natural question is whether private algorithms can achieve higher utility in the presence of public features. We give a positive answer for multi-encoder models where one of the encoders operates on public features. We develop new algorithms that take advantage of this separation by only protecting certain sufficient statistics (instead of adding noise to the gradient). This method has a guaranteed utility improvement for linear regression, and importantly, achieves the state of the art on two standard private recommendation benchmarks, demonstrating the importance of methods that adapt to the private-public feature separation.

Via

Access Paper or Ask Questions

Private Matrix Factorization with Public Item Features

Sep 17, 2023

Mihaela Curmei, Walid Krichene, Li Zhang, Mukund Sundararajan

Figure 1 for Private Matrix Factorization with Public Item Features

Figure 2 for Private Matrix Factorization with Public Item Features

Figure 3 for Private Matrix Factorization with Public Item Features

Figure 4 for Private Matrix Factorization with Public Item Features

Abstract:We consider the problem of training private recommendation models with access to public item features. Training with Differential Privacy (DP) offers strong privacy guarantees, at the expense of loss in recommendation quality. We show that incorporating public item features during training can help mitigate this loss in quality. We propose a general approach based on collective matrix factorization (CMF), that works by simultaneously factorizing two matrices: the user feedback matrix (representing sensitive data) and an item feature matrix that encodes publicly available (non-sensitive) item information. The method is conceptually simple, easy to tune, and highly scalable. It can be applied to different types of public item data, including: (1) categorical item features; (2) item-item similarities learned from public sources; and (3) publicly available user feedback. Furthermore, these data modalities can be collectively utilized to fully leverage public data. Evaluating our method on a standard DP recommendation benchmark, we find that using public item features significantly narrows the quality gap between private models and their non-private counterparts. As privacy constraints become more stringent, models rely more heavily on public side features for recommendation. This results in a smooth transition from collaborative filtering to item-based contextual recommendations.

* Presented at ACM Recsys 2023

Via

Access Paper or Ask Questions

Multi-Task Differential Privacy Under Distribution Skew

Feb 15, 2023

Walid Krichene, Prateek Jain, Shuang Song, Mukund Sundararajan, Abhradeep Thakurta, Li Zhang

Abstract:We study the problem of multi-task learning under user-level differential privacy, in which $n$ users contribute data to $m$ tasks, each involving a subset of users. One important aspect of the problem, that can significantly impact quality, is the distribution skew among tasks. Certain tasks may have much fewer data samples than others, making them more susceptible to the noise added for privacy. It is natural to ask whether algorithms can adapt to this skew to improve the overall utility. We give a systematic analysis of the problem, by studying how to optimally allocate a user's privacy budget among tasks. We propose a generic algorithm, based on an adaptive reweighting of the empirical loss, and show that when there is task distribution skew, this gives a quantifiable improvement of excess empirical risk. Experimental studies on recommendation problems that exhibit a long tail of small tasks, demonstrate that our methods significantly improve utility, achieving the state of the art on two standard benchmarks.

Via

Access Paper or Ask Questions

Differentially Private Image Classification from Features

Nov 24, 2022

Harsh Mehta, Walid Krichene, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky

Figure 1 for Differentially Private Image Classification from Features

Figure 2 for Differentially Private Image Classification from Features

Figure 3 for Differentially Private Image Classification from Features

Figure 4 for Differentially Private Image Classification from Features

Abstract:Leveraging transfer learning has recently been shown to be an effective strategy for training large models with Differential Privacy (DP). Moreover, somewhat surprisingly, recent works have found that privately training just the last layer of a pre-trained model provides the best utility with DP. While past studies largely rely on algorithms like DP-SGD for training large models, in the specific case of privately learning from features, we observe that computational burden is low enough to allow for more sophisticated optimization schemes, including second-order methods. To that end, we systematically explore the effect of design parameters such as loss function and optimization algorithm. We find that, while commonly used logistic regression performs better than linear regression in the non-private setting, the situation is reversed in the private setting. We find that linear regression is much more effective than logistic regression from both privacy and computational aspects, especially at stricter epsilon values ($\epsilon < 1$). On the optimization side, we also explore using Newton's method, and find that second-order information is quite helpful even with privacy, although the benefit significantly diminishes with stricter privacy guarantees. While both methods use second-order information, least squares is effective at lower epsilons while Newton's method is effective at larger epsilon values. To combine the benefits of both, we propose a novel algorithm called DP-FC, which leverages feature covariance instead of the Hessian of the logistic regression loss and performs well across all $\epsilon$ values we tried. With this, we obtain new SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of $\epsilon$ typically considered. Most remarkably, on ImageNet-1K, we obtain top-1 accuracy of 88\% under (8, $8 * 10^{-7}$)-DP and 84.3\% under (0.1, $8 * 10^{-7}$)-DP.

Via

Access Paper or Ask Questions

Reciprocity in Machine Learning

Feb 19, 2022

Mukund Sundararajan, Walid Krichene

Figure 1 for Reciprocity in Machine Learning

Figure 2 for Reciprocity in Machine Learning

Figure 3 for Reciprocity in Machine Learning

Figure 4 for Reciprocity in Machine Learning

Abstract:Machine learning is pervasive. It powers recommender systems such as Spotify, Instagram and YouTube, and health-care systems via models that predict sleep patterns, or the risk of disease. Individuals contribute data to these models and benefit from them. Are these contributions (outflows of influence) and benefits (inflows of influence) reciprocal? We propose measures of outflows, inflows and reciprocity building on previously proposed measures of training data influence. Our initial theoretical and empirical results indicate that under certain distributional assumptions, some classes of models are approximately reciprocal. We conclude with several open directions.

Via

Access Paper or Ask Questions

ALX: Large Scale Matrix Factorization on TPUs

Dec 03, 2021

Harsh Mehta, Steffen Rendle, Walid Krichene, Li Zhang

Figure 1 for ALX: Large Scale Matrix Factorization on TPUs

Figure 2 for ALX: Large Scale Matrix Factorization on TPUs

Figure 3 for ALX: Large Scale Matrix Factorization on TPUs

Figure 4 for ALX: Large Scale Matrix Factorization on TPUs

Abstract:We present ALX, an open-source library for distributed matrix factorization using Alternating Least Squares, written in JAX. Our design allows for efficient use of the TPU architecture and scales well to matrix factorization problems of O(B) rows/columns by scaling the number of available TPU cores. In order to spur future research on large scale matrix factorization methods and to illustrate the scalability properties of our own implementation, we also built a real world web link prediction dataset called WebGraph. This dataset can be easily modeled as a matrix factorization problem. We created several variants of this dataset based on locality and sparsity properties of sub-graphs. The largest variant of WebGraph has around 365M nodes and training a single epoch finishes in about 20 minutes with 256 TPU cores. We include speed and performance numbers of ALX on all variants of WebGraph. Both the framework code and the dataset is open-sourced.

Via

Access Paper or Ask Questions

iALS++: Speeding up Matrix Factorization with Subspace Optimization

Oct 26, 2021

Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

Figure 1 for iALS++: Speeding up Matrix Factorization with Subspace Optimization

Figure 2 for iALS++: Speeding up Matrix Factorization with Subspace Optimization

Figure 3 for iALS++: Speeding up Matrix Factorization with Subspace Optimization

Figure 4 for iALS++: Speeding up Matrix Factorization with Subspace Optimization

Abstract:iALS is a popular algorithm for learning matrix factorization models from implicit feedback with alternating least squares. This algorithm was invented over a decade ago but still shows competitive quality compared to recent approaches like VAE, EASE, SLIM, or NCF. Due to a computational trick that avoids negative sampling, iALS is very efficient especially for large item catalogues. However, iALS does not scale well with large embedding dimensions, d, due to its cubic runtime dependency on d. Coordinate descent variations, iCD, have been proposed to lower the complexity to quadratic in d. In this work, we show that iCD approaches are not well suited for modern processors and can be an order of magnitude slower than a careful iALS implementation for small to mid scale embedding sizes (d ~ 100) and only perform better than iALS on large embeddings d ~ 1000. We propose a new solver iALS++ that combines the advantages of iALS in terms of vector processing with a low computational complexity as in iCD. iALS++ is an order of magnitude faster than iCD both for small and large embedding dimensions. It can solve benchmark problems like Movielens 20M or Million Song Dataset even for 1000 dimensional embedding vectors in a few minutes.

Via

Access Paper or Ask Questions

Revisiting the Performance of iALS on Item Recommendation Benchmarks

Oct 26, 2021

Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

Figure 1 for Revisiting the Performance of iALS on Item Recommendation Benchmarks

Figure 2 for Revisiting the Performance of iALS on Item Recommendation Benchmarks

Figure 3 for Revisiting the Performance of iALS on Item Recommendation Benchmarks

Figure 4 for Revisiting the Performance of iALS on Item Recommendation Benchmarks

Abstract:Matrix factorization learned by implicit alternating least squares (iALS) is a popular baseline in recommender system research publications. iALS is known to be one of the most computationally efficient and scalable collaborative filtering methods. However, recent studies suggest that its prediction quality is not competitive with the current state of the art, in particular autoencoders and other item-based collaborative filtering methods. In this work, we revisit the iALS algorithm and present a bag of tricks that we found useful when applying iALS. We revisit four well-studied benchmarks where iALS was reported to perform poorly and show that with proper tuning, iALS is highly competitive and outperforms any method on at least half of the comparisons. We hope that these high quality results together with iALS's known scalability spark new interest in applying and further improving this decade old technique.

Via

Access Paper or Ask Questions

Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

Jul 20, 2021

Steve Chien, Prateek Jain, Walid Krichene, Steffen Rendle, Shuang Song, Abhradeep Thakurta, Li Zhang

Figure 1 for Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

Figure 2 for Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

Figure 3 for Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

Figure 4 for Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

Abstract:We study the problem of differentially private (DP) matrix completion under user-level privacy. We design a joint differentially private variant of the popular Alternating-Least-Squares (ALS) method that achieves: i) (nearly) optimal sample complexity for matrix completion (in terms of number of items, users), and ii) the best known privacy/utility trade-off both theoretically, as well as on benchmark data sets. In particular, we provide the first global convergence analysis of ALS with noise introduced to ensure DP, and show that, in comparison to the best known alternative (the Private Frank-Wolfe algorithm by Jain et al. (2018)), our error bounds scale significantly better with respect to the number of items and users, which is critical in practical problems. Extensive validation on standard benchmarks demonstrate that the algorithm, in combination with carefully designed sampling procedures, is significantly more accurate than existing techniques, thus promising to be the first practical DP embedding model.

Via

Access Paper or Ask Questions