Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lars Nørvang Andersen

Self-Distillation for Gaussian Process Regression and Classification

Apr 05, 2023

Kenneth Borup, Lars Nørvang Andersen

Abstract:We propose two approaches to extend the notion of knowledge distillation to Gaussian Process Regression (GPR) and Gaussian Process Classification (GPC); data-centric and distribution-centric. The data-centric approach resembles most current distillation techniques for machine learning, and refits a model on deterministic predictions from the teacher, while the distribution-centric approach, re-uses the full probabilistic posterior for the next iteration. By analyzing the properties of these approaches, we show that the data-centric approach for GPR closely relates to known results for self-distillation of kernel ridge regression and that the distribution-centric approach for GPR corresponds to ordinary GPR with a very particular choice of hyperparameters. Furthermore, we demonstrate that the distribution-centric approach for GPC approximately corresponds to data duplication and a particular scaling of the covariance and that the data-centric approach for GPC requires redefining the model from a Binomial likelihood to a continuous Bernoulli likelihood to be well-specified. To the best of our knowledge, our proposed approaches are the first to formulate knowledge distillation specifically for Gaussian Process models.

* 10 pages; code at https://github.com/Kennethborup/gaussian_process_self_distillation

Via

Access Paper or Ask Questions

Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Oct 12, 2021

Friedrich Dörmann, Osvald Frisk, Lars Nørvang Andersen, Christian Fischer Pedersen

Figure 1 for Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Figure 2 for Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Figure 3 for Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Figure 4 for Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Abstract:Learning often involves sensitive data and as such, privacy preserving extensions to Stochastic Gradient Descent (SGD) and other machine learning algorithms have been developed using the definitions of Differential Privacy (DP). In differentially private SGD, the gradients computed at each training iteration are subject to two different types of noise. Firstly, inherent sampling noise arising from the use of minibatches. Secondly, additive Gaussian noise from the underlying mechanisms that introduce privacy. In this study, we show that these two types of noise are equivalent in their effect on the utility of private neural networks, however they are not accounted for equally in the privacy budget. Given this observation, we propose a training paradigm that shifts the proportions of noise towards less inherent and more additive noise, such that more of the overall noise can be accounted for in the privacy budget. With this paradigm, we are able to improve on the state-of-the-art in the privacy/utility tradeoff of private end-to-end CNNs.

* 2021 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)

Via

Access Paper or Ask Questions