Abstract:We study the excess minimum risk in statistical inference, defined as the difference between the minimum expected loss in estimating a random variable from an observed feature vector and the minimum expected loss in estimating the same random variable from a transformation (statistic) of the feature vector. After characterizing lossless transformations, i.e., transformations for which the excess risk is zero for all loss functions, we construct a partitioning test statistic for the hypothesis that a given transformation is lossless and show that for i.i.d. data the test is strongly consistent. More generally, we develop information-theoretic upper bounds on the excess risk that uniformly hold over fairly general classes of loss functions. Based on these bounds, we introduce the notion of a delta-lossless transformation and give sufficient conditions for a given transformation to be universally delta-lossless. Applications to classification, nonparametric regression, portfolio strategies, information bottleneck, and deep learning, are also surveyed.
Abstract:We study the problem nonparametric classification with repeated observations. Let $\bX$ be the $d$ dimensional feature vector and let $Y$ denote the label taking values in $\{1,\dots ,M\}$. In contrast to usual setup with large sample size $n$ and relatively low dimension $d$, this paper deals with the situation, when instead of observing a single feature vector $\bX$ we are given $t$ repeated feature vectors $\bV_1,\dots ,\bV_t $. Some simple classification rules are presented such that the conditional error probabilities have exponential convergence rate of convergence as $t\to\infty$. In the analysis, we investigate particular models like robust detection by nominal densities, prototype classification, linear transformation, linear classification, scaling.
Abstract:In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$, taking values in $\mathbb{R}^d \times \mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.