Abstract:The least squares (LS) estimate is the archetypical solution of linear regression problems. The asymptotic Gaussianity of the scaled LS error is often used to construct approximate confidence ellipsoids around the LS estimate, however, for finite samples these ellipsoids do not come with strict guarantees, unless some strong assumptions are made on the noise distributions. The paper studies the distribution-free Sign-Perturbed Sums (SPS) ellipsoidal outer approximation (EOA) algorithm which can construct non-asymptotically guaranteed confidence ellipsoids under mild assumptions, such as independent and symmetric noise terms. These ellipsoids have the same center and orientation as the classical asymptotic ellipsoids, only their radii are different, which radii can be computed by convex optimization. Here, we establish high probability non-asymptotic upper bounds for the sizes of SPS outer ellipsoids for linear regression problems and show that the volumes of these ellipsoids decrease at the optimal rate. Finally, the difference between our theoretical bounds and the empirical sizes of the regions are investigated experimentally.
Abstract:Stochastic multi-armed bandits (MABs) provide a fundamental reinforcement learning model to study sequential decision making in uncertain environments. The upper confidence bounds (UCB) algorithm gave birth to the renaissance of bandit algorithms, as it achieves near-optimal regret rates under various moment assumptions. Up until recently most UCB methods relied on concentration inequalities leading to confidence bounds which depend on moment parameters, such as the variance proxy, that are usually unknown in practice. In this paper, we propose a new distribution-free, data-driven UCB algorithm for symmetric reward distributions, which needs no moment information. The key idea is to combine a refined, one-sided version of the recently developed resampled median-of-means (RMM) method with UCB. We prove a near-optimal regret bound for the proposed anytime, parameter-free RMM-UCB method, even for heavy-tailed distributions.
Abstract:This letter studies a distribution-free, finite-sample data perturbation (DP) method, the Residual-Permuted Sums (RPS), which is an alternative of the Sign-Perturbed Sums (SPS) algorithm, to construct confidence regions. While SPS assumes independent (but potentially time-varying) noise terms which are symmetric about zero, RPS gets rid of the symmetricity assumption, but assumes i.i.d. noises. The main idea is that RPS permutes the residuals instead of perturbing their signs. This letter introduces RPS in a flexible way, which allows various design-choices. RPS has exact finite sample coverage probabilities and we provide the first proof that these permutation-based confidence regions are uniformly strongly consistent under general assumptions. This means that the RPS regions almost surely shrink around the true parameters as the sample size increases. The ellipsoidal outer-approximation (EOA) of SPS is also extended to RPS, and the effectiveness of RPS is validated by numerical experiments, as well.
Abstract:Sign-Perturbed Sum (SPS) is a powerful finite-sample system identification algorithm which can construct confidence regions for the true data generating system with exact coverage probabilities, for any finite sample size. SPS was developed in a series of papers and it has a wide range of applications, from general linear systems, even in a closed-loop setup, to nonlinear and nonparametric approaches. Although several theoretical properties of SPS were proven in the literature, the sample complexity of the method was not analysed so far. This paper aims to fill this gap and provides the first results on the sample complexity of SPS. Here, we focus on scalar linear regression problems, that is we study the behaviour of SPS confidence intervals. We provide high probability upper bounds, under three different sets of assumptions, showing that the sizes of SPS confidence intervals shrink at a geometric rate around the true parameter, if the observation noises are subgaussian. We also show that similar bounds hold for the previously proposed outer approximation of the confidence region. Finally, we present simulation experiments comparing the theoretical and the empirical convergence rates.