Abstract:We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries: oblivious and adaptive. We design several robust algorithms that outperform the state of the art even in the special case when oblivious adversary simply adds Gaussian noise. In particular, we provide a polynomial-time algorithm that with high probability recovers the signal up to error $O(\sqrt{\varepsilon})$ as long as the number of samples $n \ge \tilde{O}(k^2/\varepsilon)$, only assuming some bounds on the third and the fourth moments of the distribution ${D}$ of the design. In addition, prior to this work, even in the special case of Gaussian design and noise, no polynomial time algorithm was known to achieve error $o(\sqrt{\varepsilon})$ in the sparse setting $n < d^2$. We show that under some assumptions on the fourth and the eighth moments of ${D}$, there is a polynomial-time algorithm that achieves error $o(\sqrt{\varepsilon})$ as long as $n \ge \tilde{O}(k^4 / \varepsilon^3)$. For Gaussian distribution, this algorithm achieves error $O(\varepsilon^{3/4})$. Moreover, our algorithm achieves error $o(\sqrt{\varepsilon})$ for all log-concave distributions if $\varepsilon \le 1/\text{polylog(d)}$. Our algorithms are based on the filtering of the covariates that uses sum-of-squares relaxations, and weighted Huber loss minimization with $\ell_1$ regularizer. We provide a novel analysis of weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries. Furthermore, we complement our algorithmic results with Statistical Query lower bounds, providing evidence that our estimators are likely to have nearly optimal sample complexity.
Abstract:We develop the first pure node-differentially-private algorithms for learning stochastic block models and for graphon estimation with polynomial running time for any constant number of blocks. The statistical utility guarantees match those of the previous best information-theoretic (exponential-time) node-private mechanisms for these problems. The algorithm is based on an exponential mechanism for a score function defined in terms of a sum-of-squares relaxation whose level depends on the number of blocks. The key ingredients of our results are (1) a characterization of the distance between the block graphons in terms of a quadratic optimization over the polytope of doubly stochastic matrices, (2) a general sum-of-squares convergence result for polynomial optimization over arbitrary polytopes, and (3) a general approach to perform Lipschitz extensions of score functions as part of the sum-of-squares algorithmic paradigm.
Abstract:We develop the first fast spectral algorithm to decompose a random third-order tensor over R^d of rank up to O(d^{3/2}/polylog(d)). Our algorithm only involves simple linear algebra operations and can recover all components in time O(d^{6.05}) under the current matrix multiplication time. Prior to this work, comparable guarantees could only be achieved via sum-of-squares [Ma, Shi, Steurer 2016]. In contrast, fast algorithms [Hopkins, Schramm, Shi, Steurer 2016] could only decompose tensors of rank at most O(d^{4/3}/polylog(d)). Our algorithmic result rests on two key ingredients. A clean lifting of the third-order tensor to a sixth-order tensor, which can be expressed in the language of tensor networks. A careful decomposition of the tensor network into a sequence of rectangular matrix multiplications, which allows us to have a fast implementation of the algorithm.
Abstract:We develop machinery to design efficiently computable and consistent estimators, achieving estimation error approaching zero as the number of observations grows, when facing an oblivious adversary that may corrupt responses in all but an $\alpha$ fraction of the samples. As concrete examples, we investigate two problems: sparse regression and principal component analysis (PCA). For sparse regression, we achieve consistency for optimal sample size $n\gtrsim (k\log d)/\alpha^2$ and optimal error rate $O(\sqrt{(k\log d)/(n\cdot \alpha^2)})$ where $n$ is the number of observations, $d$ is the number of dimensions and $k$ is the sparsity of the parameter vector, allowing the fraction of inliers to be inverse-polynomial in the number of samples. Prior to this work, no estimator was known to be consistent when the fraction of inliers $\alpha$ is $o(1/\log \log n)$, even for (non-spherical) Gaussian design matrices. Results holding under weak design assumptions and in the presence of such general noise have only been shown in dense setting (i.e., general linear regression) very recently by d'Orsi et al. [dNS21]. In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix (usually used in matrix completion). Previous works could obtain non-trivial guarantees only under the assumptions that the measurement noise corresponding to the inliers is polynomially small in $n$ (e.g., Gaussian with variance $1/n^2$). To devise our estimators, we equip the Huber loss with non-smooth regularizers such as the $\ell_1$ norm or the nuclear norm, and extend d'Orsi et al.'s approach [dNS21] in a novel way to analyze the loss function. Our machinery appears to be easily applicable to a wide range of estimation problems.