Abstract: While score-based diffusion models have achieved exceptional sampling quality, their sampling speeds are often limited by the high computational burden of score function evaluations. Despite the recent remarkable empirical advances in speeding up the score-based samplers, theoretical understanding of acceleration techniques remains largely limited. To bridge this gap, we propose a novel training-free acceleration scheme for stochastic samplers. Under minimal assumptions -- namely, $L^2$-accurate score estimates and a finite second-moment condition on the target distribution -- our accelerated sampler provably achieves $\varepsilon$-accuracy in total variation within $\widetilde{O}(d^{5/4}/\sqrt{\varepsilon})$ iterations, thereby significantly improving upon the $\widetilde{O}(d/\varepsilon)$ iteration complexity of standard score-based samplers. Notably, our convergence theory does not rely on restrictive assumptions on the target distribution or higher-order score estimation guarantees.
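The accelerated scheme itself is not spelled out in the abstract. As a point of reference only, the sketch below implements a standard (unaccelerated) score-based stochastic sampler, namely an Euler-Maruyama discretization of the reverse-time VP-SDE driven by an estimated score; the score interface `score_fn`, the linear noise schedule, and the step count are illustrative assumptions, not the proposed method.

```python
import numpy as np

def baseline_stochastic_sampler(score_fn, d, num_steps=1000, seed=0):
    """Standard (unaccelerated) stochastic sampler: Euler-Maruyama
    discretization of the reverse-time VP-SDE, driven by an estimated
    score function. Illustrative baseline only -- not the accelerated scheme."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 2e-2, num_steps)   # assumed linear noise schedule
    x = rng.standard_normal(d)                   # start from the prior N(0, I)
    for k in reversed(range(num_steps)):
        beta = betas[k]
        t = (k + 1) / num_steps
        s = score_fn(x, t)                       # estimated score of the time-t marginal
        # Reverse-SDE drift uses the score; fresh Gaussian noise except at the last step.
        x = x + beta * (0.5 * x + s)
        if k > 0:
            x = x + np.sqrt(beta) * rng.standard_normal(d)
    return x
```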
Abstract: Multi-armed bandit (MAB) algorithms have achieved significant success in sequential decision-making applications, under the premise that humans perfectly implement the recommended policy. However, existing methods often overlook the crucial factor of human trust in learning algorithms. When trust is lacking, humans may deviate from the recommended policy, leading to undesired learning performance. Motivated by this gap, we study the trust-aware MAB problem by integrating a dynamic trust model into the standard MAB framework. Specifically, the model assumes that the recommended policy and the actually implemented policy differ depending on human trust, which in turn evolves with the quality of the recommended policy. We establish the minimax regret in the presence of the trust issue and demonstrate the suboptimality of vanilla MAB algorithms such as the upper confidence bound (UCB) algorithm. To overcome this limitation, we introduce a novel two-stage trust-aware procedure that provably attains near-optimal statistical guarantees. A simulation study is conducted to illustrate the benefits of our proposed algorithm when dealing with the trust issue.
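For reference, here is a minimal sketch of the vanilla UCB algorithm the abstract refers to (the baseline shown to be suboptimal under the trust model); the reward interface `pull_arm` and the exploration constant are assumptions, and no trust dynamics are modeled here.

```python
import numpy as np

def ucb1(pull_arm, n_arms, horizon):
    """Vanilla UCB1: recommend the arm with the largest upper confidence bound.
    `pull_arm(a)` is an assumed interface returning a reward in [0, 1]."""
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        if t <= n_arms:                                   # pull each arm once to initialize
            arm = t - 1
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)     # exploration bonus
            arm = int(np.argmax(means + bonus))
        r = pull_arm(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]      # running mean update
    return means, counts
```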
Abstract: Motivated by a range of applications, we study in this paper the problem of transfer learning for nonparametric contextual multi-armed bandits under the covariate shift model, where data collected from source bandits are available before the target bandit learning begins. The minimax rate of convergence for the cumulative regret is established, and a novel transfer learning algorithm that attains the minimax regret is proposed. The results quantify the contribution of the data from the source domains to learning in the target domain in the context of nonparametric contextual multi-armed bandits. In view of the general impossibility of adaptation to unknown smoothness, we develop a data-driven algorithm that achieves near-optimal statistical guarantees (up to a logarithmic factor) while automatically adapting to the unknown parameters over a large collection of parameter spaces under an additional self-similarity assumption. A simulation study is carried out to illustrate the benefits of utilizing the data from the auxiliary source domains for learning in the target domain.
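The abstract does not detail the transfer procedure. The following is a simplified, hypothetical illustration of the basic idea that source data can sharpen per-arm reward estimates in the target domain by pooling observations within local context bins; it ignores the bias that covariate shift introduces (which the actual minimax-optimal algorithm must control), and all names and the one-dimensional binning are assumptions.

```python
import numpy as np

def pooled_arm_estimates(source_ctx, source_arm, source_rwd,
                         target_ctx, target_arm, target_rwd,
                         n_arms, n_bins=10):
    """Hypothetical illustration: bin a one-dimensional covariate in [0, 1)
    and, within each bin, average rewards pooled over source and target data
    to estimate each arm's mean reward. Not the paper's algorithm."""
    ctx = np.concatenate([source_ctx, target_ctx])
    arm = np.concatenate([source_arm, target_arm])
    rwd = np.concatenate([source_rwd, target_rwd])
    bins = np.minimum((ctx * n_bins).astype(int), n_bins - 1)
    est = np.full((n_bins, n_arms), np.nan)
    for b in range(n_bins):
        for a in range(n_arms):
            mask = (bins == b) & (arm == a)
            if mask.any():
                est[b, a] = rwd[mask].mean()   # pooled mean-reward estimate
    return est
```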
Abstract: Eigenvector perturbation analysis plays a vital role in various statistical data science applications. A large body of prior work, however, has focused on establishing $\ell_{2}$ eigenvector perturbation bounds, which are often highly inadequate for tasks that rely on the fine-grained behavior of an eigenvector. This paper makes progress on this front by studying the perturbation of linear functions of an unknown eigenvector. Focusing on two fundamental problems -- matrix denoising and principal component analysis -- in the presence of Gaussian noise, we develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. In order to mitigate a non-negligible bias issue inherent to the natural "plug-in" estimator, we develop de-biased estimators that (1) attain the minimax lower bounds for a family of scenarios (modulo some logarithmic factor), and (2) can be computed in a data-driven manner without sample splitting. Notably, the proposed estimators are nearly minimax optimal even when the associated eigen-gap is substantially smaller than what is required in prior theory.
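As a concrete reference point, the sketch below computes the natural "plug-in" estimate of a linear functional $a^{\top}u^{\star}$ of the leading eigenvector in the matrix denoising setting (a symmetric noisy observation of a low-rank signal); the de-biasing correction developed in the paper is not reproduced here, and the function names are illustrative.

```python
import numpy as np

def plugin_linear_functional(M, a):
    """Natural plug-in estimate of a^T u*, where u* is the leading eigenvector
    of the unknown signal matrix and M is its noisy symmetric observation.
    The paper's de-biased estimator corrects the non-negligible bias of this
    quantity; that correction is not shown here."""
    M_sym = (M + M.T) / 2.0                              # symmetrize, just in case
    eigvals, eigvecs = np.linalg.eigh(M_sym)
    u_hat = eigvecs[:, np.argmax(np.abs(eigvals))]       # leading eigenvector (by magnitude)
    return float(a @ u_hat)
```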
Abstract: Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning. When it comes to the synchronous setting (in which independent samples for all state-action pairs are drawn from a generative model in each iteration), substantial progress has been made recently towards understanding the sample efficiency of Q-learning. Take a $\gamma$-discounted infinite-horizon MDP with state space $\mathcal{S}$ and action space $\mathcal{A}$: to yield an entrywise $\varepsilon$-accurate estimate of the optimal Q-function, state-of-the-art theory for Q-learning proves that a sample size on the order of $\frac{|\mathcal{S}||\mathcal{A}|}{(1-\gamma)^5\varepsilon^{2}}$ is sufficient, which, however, fails to match the existing minimax lower bound. This gives rise to natural questions: what is the sharp sample complexity of Q-learning? Is Q-learning provably sub-optimal? In this work, we settle these questions by (1) demonstrating that the sample complexity of Q-learning is at most on the order of $\frac{|\mathcal{S}||\mathcal{A}|}{(1-\gamma)^4\varepsilon^2}$ (up to some log factor) for any $0<\varepsilon <1$, and (2) developing a matching lower bound to confirm the sharpness of our result. Our findings unveil both the effectiveness and the limitation of Q-learning: its sample complexity matches that of speedy Q-learning without requiring extra computation and storage, albeit still being considerably higher than the minimax lower bound.
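The synchronous Q-learning update analyzed here is standard; the sketch below spells it out, with the generative-model interface `sample_next_state` and the rescaled linear step-size schedule treated as illustrative assumptions rather than the choices analyzed in the paper.

```python
import numpy as np

def synchronous_q_learning(sample_next_state, reward, n_states, n_actions,
                           gamma, num_iters, seed=0):
    """Synchronous Q-learning: in every iteration, draw one fresh sample
    s' ~ P(. | s, a) for *each* state-action pair from the generative model,
    then apply the Q-learning update with step size eta_t."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for t in range(1, num_iters + 1):
        eta = 1.0 / (1.0 + (1.0 - gamma) * t)            # illustrative step-size schedule
        Q_new = Q.copy()
        for s in range(n_states):
            for a in range(n_actions):
                s_next = sample_next_state(s, a, rng)    # one draw from the generative model
                target = reward[s, a] + gamma * Q[s_next].max()
                Q_new[s, a] = (1 - eta) * Q[s, a] + eta * target
        Q = Q_new
    return Q
```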
Abstract: We study the distribution and uncertainty of nonconvex optimization for noisy tensor completion -- the problem of estimating a low-rank tensor given incomplete and corrupted observations of its entries. Focusing on a two-stage estimation algorithm proposed by Cai et al. (2019), we characterize the distribution of this nonconvex estimator down to fine scales. This distributional theory in turn allows one to construct valid and short confidence intervals for both the unseen tensor entries and the unknown tensor factors. The proposed inferential procedure enjoys several important features: (1) it is fully adaptive to noise heteroscedasticity, and (2) it is data-driven and automatically adapts to unknown noise distributions. Furthermore, our findings unveil the statistical optimality of nonconvex tensor completion: it attains unimprovable $\ell_{2}$ accuracy -- including both the rates and the pre-constants -- when estimating both the unknown tensor and the underlying tensor factors.
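The abstract does not state the variance formulas underlying the confidence intervals. The sketch below shows only the generic final step, forming a two-sided normal interval for an unseen entry from a point estimate and an estimated standard deviation; the data-driven variance estimate that the distributional theory supplies is assumed to be given, and is not derived here.

```python
from scipy.stats import norm

def entrywise_confidence_interval(point_estimate, std_estimate, alpha=0.05):
    """Generic final step of entrywise inference: a (1 - alpha) two-sided normal
    confidence interval built from a point estimate and an estimated standard
    deviation. The paper's adaptive, data-driven variance estimate is assumed
    to be supplied as `std_estimate`."""
    z = norm.ppf(1.0 - alpha / 2.0)
    return point_estimate - z * std_estimate, point_estimate + z * std_estimate
```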
Abstract: We study a noisy symmetric tensor completion problem of broad practical interest, namely, the reconstruction of a low-rank symmetric tensor from highly incomplete and randomly corrupted observations of its entries. While a variety of prior work has been dedicated to this problem, existing algorithms either are computationally too expensive for large-scale applications, or come with sub-optimal statistical guarantees. Focusing on "incoherent" and well-conditioned tensors of a constant CP rank, we propose a two-stage nonconvex algorithm --- (vanilla) gradient descent following a rough initialization --- that achieves the best of both worlds. Specifically, the proposed nonconvex algorithm faithfully completes the tensor and retrieves all individual tensor factors within nearly linear time, while at the same time enjoying near-optimal statistical guarantees (i.e., minimal sample complexity and optimal estimation accuracy). The estimation errors are evenly spread out across all entries, thus achieving optimal $\ell_{\infty}$ statistical accuracy. The insight conveyed through our analysis of nonconvex optimization might have implications for other tensor estimation problems.
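A minimal sketch of the second stage, vanilla gradient descent on a symmetric CP factorization fit to the observed entries of an order-3 tensor, is given below; the careful spectral initialization, the step size, and the sampling model are simplified assumptions (here `U` is started from small random values for brevity, which is not the paper's initialization).

```python
import numpy as np

def tensor_completion_gd(T_obs, mask, rank, step=0.01, num_iters=500, seed=0):
    """Gradient descent on the squared loss over observed entries of a symmetric
    order-3 tensor with CP factorization T ~ sum_r u_r (x) u_r (x) u_r.
    Second stage only; random initialization is an illustrative simplification."""
    d = T_obs.shape[0]
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((d, rank))
    for _ in range(num_iters):
        T_hat = np.einsum('ir,jr,kr->ijk', U, U, U)
        R = mask * (T_hat - T_obs)                       # residual on observed entries
        grad = (np.einsum('ijk,jr,kr->ir', R, U, U)      # gradient through each of the
                + np.einsum('ijk,ir,kr->jr', R, U, U)    # three modes in which U appears
                + np.einsum('ijk,ir,jr->kr', R, U, U))
        U = U - step * grad
    return U
```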
Abstract: This paper is concerned with estimating the column space of an unknown low-rank matrix $\boldsymbol{A}^{\star}\in\mathbb{R}^{d_{1}\times d_{2}}$, given noisy and partial observations of its entries. There is no shortage of scenarios where the observations --- while being too noisy to support faithful recovery of the entire matrix --- still convey sufficient information to enable reliable estimation of the column space of interest. This is particularly evident and crucial for the highly unbalanced case where the column dimension $d_{2}$ far exceeds the row dimension $d_{1}$, which is the focal point of the current paper. We investigate an efficient spectral method, which operates upon the sample Gram matrix with diagonal deletion. We establish statistical guarantees for this method in terms of both $\ell_{2}$ and $\ell_{2,\infty}$ estimation accuracy, which improve upon prior results if $d_{2}$ is substantially larger than $d_{1}$. To illustrate the effectiveness of our findings, we develop consequences of our general theory for three applications of practical importance: (1) tensor completion from noisy data, (2) covariance estimation with missing data, and (3) community recovery in bipartite graphs. Our theory leads to improved performance guarantees for all three cases.
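A minimal sketch of the diagonal-deleted spectral method described above, assuming unobserved entries of the $d_{1}\times d_{2}$ matrix are filled with zeros: form the sample Gram matrix, delete its diagonal, and take the top-$r$ eigenvectors as the column-space estimate. Rescaling by the sampling rate is omitted since it does not change the eigenvectors.

```python
import numpy as np

def diagonal_deleted_spectral(Y_obs, rank):
    """Spectral estimate of the column space of a low-rank matrix from noisy,
    partially observed entries: compute the sample Gram matrix of the
    zero-filled observation, delete its diagonal (which carries a bias),
    and return the top-`rank` eigenvectors."""
    G = Y_obs @ Y_obs.T                          # d1 x d1 sample Gram matrix
    np.fill_diagonal(G, 0.0)                     # diagonal deletion
    eigvals, eigvecs = np.linalg.eigh(G)
    idx = np.argsort(eigvals)[::-1][:rank]       # indices of the top-`rank` eigenvalues
    return eigvecs[:, idx]                       # orthonormal basis estimating the column space
```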