Ho Chung Leon Law

Hyperparameter Learning via Distributional Transfer

Oct 15, 2018

Ho Chung Leon Law, Peilin Zhao, Junzhou Huang, Dino Sejdinovic

Abstract: Bayesian optimisation is a popular technique for hyperparameter learning but typically requires initial 'exploration' even in cases where potentially similar prior tasks have been solved. We propose to transfer information across tasks using kernel embeddings of distributions of training datasets used in those tasks. The resulting method converges faster than existing baselines, in some cases requiring only a few evaluations of the target objective.

A Differentially Private Kernel Two-Sample Test

Aug 01, 2018

Anant Raj, Ho Chung Leon Law, Dino Sejdinovic, Mijung Park

Abstract: Kernel two-sample testing is a useful statistical tool for determining whether data samples arise from different distributions without imposing any parametric assumptions on those distributions. However, raw data samples can expose sensitive information about individuals who participate in scientific studies, which makes the current tests vulnerable to privacy breaches. Hence, we design a new framework for kernel two-sample testing conforming to differential privacy constraints, in order to guarantee the privacy of subjects in the data. Unlike existing differentially private parametric tests that simply add noise to data, kernel-based testing poses a challenge due to the complex dependence of test statistics on the raw data, as these statistics correspond to estimators of distances between representations of probability measures in Hilbert spaces. Our approach considers finite-dimensional approximations to those representations. As a result, a simple chi-squared test is obtained, where the test statistic depends on the mean and covariance of empirical differences between the samples, which we perturb for a privacy guarantee. We investigate the utility of our framework in two realistic settings and conclude that our method requires only a relatively modest increase in sample size to achieve a similar level of power to the non-private tests in both settings.

Variational Learning on Aggregate Outputs with Gaussian Processes

May 22, 2018

Ho Chung Leon Law, Dino Sejdinovic, Ewan Cameron, Tim CD Lucas, Seth Flaxman, Katherine Battle, Kenji Fukumizu

Abstract: While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on variational learning with a model of output aggregation and Gaussian processes, where aggregation leads to intractability of the standard evidence lower bounds. We propose new bounds and tractable approximations, leading to improved prediction accuracy and scalability to large datasets, while explicitly taking uncertainty into account. We develop a framework which extends to several types of likelihoods, including the Poisson model for aggregated count data. We apply our framework to a challenging and important problem, the fine-scale spatial modelling of malaria incidence, with over 1 million observations.
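
To make the idea of a Poisson model for aggregated count data concrete, the following is a minimal sketch, not the authors' implementation. It assumes each coarse region (a "bag") has an observed count, that every fine-scale input in the bag has a latent rate, and that the bag rate is a weighted sum of those rates. The function name `aggregated_poisson_loglik`, the weighting scheme, and the toy numbers are all hypothetical; in the paper's setting the latent rates would come from a Gaussian process rather than being fixed.

```python
import math


def aggregated_poisson_loglik(bag_counts, bag_rates, bag_weights):
    """Log-likelihood of bag-level counts under an aggregated Poisson model.

    Assumption (illustrative only): the count for a bag is Poisson with
    rate equal to the weighted sum of the fine-scale rates inside the bag.
    """
    total = 0.0
    for count, rates, weights in zip(bag_counts, bag_rates, bag_weights):
        # Aggregate the fine-scale rates up to the level of the observation.
        bag_rate = sum(w * r for w, r in zip(weights, rates))
        # Poisson log-pmf: y * log(rate) - rate - log(y!)
        total += count * math.log(bag_rate) - bag_rate - math.lgamma(count + 1)
    return total


# Hypothetical toy data: two bags with three and two fine-scale inputs.
counts = [3, 10]
rates = [[0.5, 1.0, 1.5], [2.0, 3.0]]
weights = [[1.0, 1.0, 1.0], [2.0, 2.0]]  # e.g. population per input (assumed)

loglik = aggregated_poisson_loglik(counts, rates, weights)

# Rates that do not match the observed counts give a lower likelihood.
mismatched = aggregated_poisson_loglik(counts, [[5.0, 5.0, 5.0], [0.1, 0.1]], weights)
```

Only the bag-level counts enter the likelihood, which is why individual rates inside a bag are not identified by the data alone and a prior over the fine-scale function is needed.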

Bayesian Approaches to Distribution Regression

Feb 22, 2018

Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, Seth Flaxman

Abstract: Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level. Current approaches, however, do not propagate the uncertainty in observations due to sampling variability in the groups. This effectively assumes that small and large groups are estimated equally well, and should have equal weight in the final regression. We account for this uncertainty with a Bayesian distribution regression formalism, improving the robustness and performance of the model when group sizes vary. We frame our models in a neural network style, allowing for simple MAP inference using backpropagation to learn the parameters, as well as MCMC-based inference which can fully propagate uncertainty. We demonstrate our approach on illustrative toy datasets, as well as on a challenging problem of predicting age from images.

* Final version to be published at AISTATS 2018

Testing and Learning on Distributions with Symmetric Noise Invariance

Nov 05, 2017

Ho Chung Leon Law, Christopher Yau, Dino Sejdinovic

Abstract: Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric two-sample testing and learning on distributions. However, it is rare that all possible differences between samples are of interest: discovered differences can be due to different types of measurement noise, data collection artefacts or other irrelevant sources of variability. We propose distances between distributions which encode invariance to additive symmetric noise, aimed at testing whether the assumed true underlying processes differ. Moreover, we construct invariant features of distributions, leading to learning algorithms robust to the impairment of the input distributions with symmetric additive noise.

* 22 pages
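
As background for the abstract above, here is a minimal sketch of the standard unbiased estimator of the squared MMD with a Gaussian kernel. It illustrates only the ordinary MMD that the abstract takes as its starting point, not the noise-invariant distances the paper proposes. The function names, the bandwidth of 1.0, and the toy Gaussian samples are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples xs and ys."""
    m, n = len(xs), len(ys)
    # Within-sample terms exclude the diagonal (i == j) to avoid bias.
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    # Cross term uses every pair of points across the two samples.
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def sample_gaussian(n, mean, rng):
    """Draw n one-dimensional points from N(mean, 1), each as a 1-tuple."""
    return [(rng.gauss(mean, 1.0),) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the toy example is repeatable
mmd_same = mmd2_unbiased(sample_gaussian(200, 0.0, rng), sample_gaussian(200, 0.0, rng))
mmd_diff = mmd2_unbiased(sample_gaussian(200, 0.0, rng), sample_gaussian(200, 2.0, rng))
```

The estimate should be close to zero when both samples come from the same distribution and clearly positive when the means differ. Because the estimator is unbiased, it can be slightly negative in the first case.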
