Abstract: We consider the problem of binary classification in a framework where the predictor $X$ takes values in an arbitrary separable metric space $\mathcal X$ and the label $Y$ takes values in $\{ \pm 1 \}$. In the first part of this work, we assume that one has direct access to an i.i.d. sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ from the unknown distribution of the pair $(X,Y)$. We derive a convergence rate for the Proto-NN classifier, which was recently introduced as a classifier for metric space-valued predictors. In the second part of the paper, we reconsider the same problem under an additional privacy constraint. More precisely, we work in the framework of local differential privacy, where one assumes that the data $(X_1,Y_1),\ldots,(X_n,Y_n)$ cannot be directly observed and only a privatised surrogate, obtained through a suitable mechanism satisfying the privacy constraint, is available. The statistician should select an optimal privacy mechanism from the class of all mechanisms that guarantee local differential privacy. Our method of choice is to add Laplace distributed noise to both a set of in Proto-NN classifier using the privatised data only is universally consistent. Finally, a rate of convergence for the privatised Proto-NN classifier is derived.
Abstract: We study the excess minimum risk in statistical inference, defined as the difference between the minimum expected loss in estimating a random variable from an observed feature vector and the minimum expected loss in estimating the same random variable from a transformation (statistic) of the feature vector. After characterizing lossless transformations, i.e., transformations for which the excess risk is zero for all loss functions, we construct a partitioning test statistic for the hypothesis that a given transformation is lossless and show that for i.i.d. data the test is strongly consistent. More generally, we develop information-theoretic upper bounds on the excess risk that hold uniformly over fairly general classes of loss functions. Based on these bounds, we introduce the notion of a delta-lossless transformation and give sufficient conditions for a given transformation to be universally delta-lossless. Applications to classification, nonparametric regression, portfolio strategies, information bottleneck, and deep learning are also surveyed.
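As a reading aid, the verbal definition above can be written out as follows. The symbols $\ell$ (loss function), $f$ (estimator), $T$ (transformation), $L^*_\ell$ and $\Delta_\ell$ are notation assumed here for illustration and are not taken from the abstract.

\[
L^*_\ell(Y \mid X) = \inf_{f} \mathbb{E}\bigl[\ell\bigl(Y, f(X)\bigr)\bigr],
\qquad
\Delta_\ell(T) = L^*_\ell\bigl(Y \mid T(X)\bigr) - L^*_\ell(Y \mid X) \ge 0 .
\]

The difference is nonnegative because every estimator based on $T(X)$ is also an estimator based on $X$. In this notation, $T$ is lossless when $\Delta_\ell(T) = 0$ for all loss functions $\ell$.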
Abstract: We study the problem of nonparametric classification with repeated observations. Let $\bX$ be the $d$-dimensional feature vector and let $Y$ denote the label taking values in $\{1,\dots ,M\}$. In contrast to the usual setup with large sample size $n$ and relatively low dimension $d$, this paper deals with the situation in which, instead of observing a single feature vector $\bX$, we are given $t$ repeated feature vectors $\bV_1,\dots ,\bV_t$. Some simple classification rules are presented such that the conditional error probabilities have an exponential rate of convergence as $t\to\infty$. In the analysis, we investigate particular models such as robust detection by nominal densities, prototype classification, linear transformation, linear classification, and scaling.
Abstract: In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$, taking values in $\mathbb{R}^d \times \mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.
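The mechanism described above can be illustrated with a minimal Python sketch. It is not the paper's estimator: it assumes a one-dimensional feature in $[0,1)$, a fixed number of equal-width cells, responses clipped to $[-\text{clip}, \text{clip}]$, and an even split of the privacy budget `eps` between the two noisy releases. The function names `privatise` and `private_partition_estimate` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def privatise(x, y, n_cells, clip, eps, rng):
    """Privatise each (x_i, y_i) locally; assumes x_i in [0, 1) (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Discretise the location of each feature: index of the cell containing x_i.
    cells = np.minimum((x * n_cells).astype(int), n_cells - 1)
    onehot = np.zeros((n, n_cells))
    onehot[np.arange(n), cells] = 1.0
    # Clip responses so that the sensitivity of the response vector is bounded.
    y_clipped = np.clip(np.asarray(y, dtype=float), -clip, clip)
    # The indicator vector has L1 sensitivity 2; with budget eps/2 the scale is 4/eps.
    z_ind = onehot + rng.laplace(scale=4.0 / eps, size=onehot.shape)
    # The response vector has L1 sensitivity 2*clip; with budget eps/2 the scale is 4*clip/eps.
    z_resp = onehot * y_clipped[:, None] + rng.laplace(
        scale=4.0 * clip / eps, size=onehot.shape
    )
    return z_ind, z_resp


def private_partition_estimate(z_ind, z_resp, threshold):
    """Cell-wise ratio of averaged noisy responses to averaged noisy indicators.

    Cells whose noisy mass does not exceed `threshold` (> 0) are set to 0.
    """
    num = z_resp.mean(axis=0)
    den = z_ind.mean(axis=0)
    return np.where(den > threshold, num / np.maximum(den, threshold), 0.0)
```

Only `z_ind` and `z_resp` leave each data holder, so the estimate is a function of the randomised output alone. For example, `private_partition_estimate(*privatise(x, y, 4, 1.0, 1.0, np.random.default_rng(0)), 0.05)` returns one estimated regression value per cell.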