Title: Regret Minimization and Statistical Inference in Online Decision Making with High-dimensional Covariates

Nov 10, 2024

Congyuan Duan, Wanteng Ma, Jiashuo Jiang, Dong Xia

Abstract: This paper investigates regret minimization, statistical inference, and their interplay in high-dimensional online decision-making based on the sparse linear context bandit model. We integrate the $\varepsilon$-greedy bandit algorithm for decision-making with a hard thresholding algorithm for estimating sparse bandit parameters and introduce an inference framework based on a debiasing method using inverse propensity weighting. Under a margin condition, our method achieves either $O(T^{1/2})$ regret or classical $O(T^{1/2})$-consistent inference, indicating an unavoidable trade-off between exploration and exploitation. If a diverse covariate condition holds, we demonstrate that a pure-greedy bandit algorithm, i.e., exploration-free, combined with a debiased estimator based on average weighting can simultaneously achieve optimal $O(\log T)$ regret and $O(T^{1/2})$-consistent inference. We also show that a simple sample mean estimator can provide valid inference for the optimal policy's value. Numerical simulations and experiments on Warfarin dosing data validate the effectiveness of our methods.
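To make the ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows a hard-thresholding operator, an $\varepsilon$-greedy arm choice that also returns the propensity of the chosen arm, and one inverse-propensity-weighted gradient step followed by thresholding. The function names, the uniform split of exploration probability across arms, the squared-error loss, and the step size `lr` are all assumptions made for illustration; the paper's actual update rule and tuning may differ.

```python
import numpy as np


def hard_threshold(beta, s):
    """Keep the s largest-magnitude entries of beta and zero out the rest."""
    out = np.zeros_like(beta)
    if s > 0:
        keep = np.argsort(np.abs(beta))[-s:]
        out[keep] = beta[keep]
    return out


def epsilon_greedy(x, betas, eps, rng):
    """Choose an arm for context x; return (arm, propensity of that arm).

    Assumption: with probability eps the arm is drawn uniformly at random,
    otherwise the arm with the largest estimated reward x @ beta is played.
    """
    n_arms = len(betas)
    greedy = int(np.argmax([x @ b for b in betas]))
    probs = np.full(n_arms, eps / n_arms)
    probs[greedy] += 1.0 - eps
    arm = int(rng.choice(n_arms, p=probs))
    return arm, float(probs[arm])


def ipw_threshold_step(beta, x, y, propensity, lr, s):
    """One inverse-propensity-weighted gradient step, then hard thresholding.

    Assumption: squared-error loss 0.5 * (y - x @ beta) ** 2 on the pulled arm,
    reweighted by 1 / propensity to correct for adaptive arm selection.
    """
    grad = -(y - x @ beta) * x / propensity
    return hard_threshold(beta - lr * grad, s)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, s = 20, 2  # hypothetical dimension and sparsity level
    betas = [np.zeros(d), np.zeros(d)]
    x = rng.normal(size=d)
    arm, prop = epsilon_greedy(x, betas, eps=0.2, rng=rng)
    betas[arm] = ipw_threshold_step(betas[arm], x, y=1.0, propensity=prop, lr=0.05, s=s)
    print(arm, prop, np.count_nonzero(betas[arm]))
```

Setting `eps=0` in this sketch gives the exploration-free (pure-greedy) choice the abstract discusses, in which case the propensity of the played arm is always 1.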
