Title: Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

Jun 16, 2022

Wonyoung Kim, Min-hwan Oh, Myunghee Cho Paik

Abstract: We propose a novel algorithm for linear contextual bandits with an $O(\sqrt{dT \log T})$ regret bound, where $d$ is the dimension of the contexts and $T$ is the time horizon. The algorithm is equipped with a novel estimator in which exploration is embedded through explicit randomization. Depending on the randomization, the estimator draws on either the contexts of all arms or only the selected contexts. We establish a self-normalized bound for our estimator, which allows a novel decomposition of the cumulative regret into additive dimension-dependent terms instead of multiplicative terms. We also prove a novel lower bound of $\Omega(\sqrt{dT})$ under our problem setting. Hence, the regret of our proposed algorithm matches the lower bound up to logarithmic factors. Numerical experiments support the theoretical guarantees and show that the proposed method outperforms existing linear bandit algorithms.
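
To make "up to logarithmic factors" concrete, the two bounds quoted in the abstract can be compared directly. This is only a sketch: the constants $c$ and $C$ are hypothetical placeholders for whatever the $\Omega(\cdot)$ and $O(\cdot)$ notation hides, and they are assumed here not to depend on $d$ or $T$.

$$
\frac{C\sqrt{dT \log T}}{c\sqrt{dT}} = \frac{C}{c}\,\sqrt{\log T}
$$

Under these assumptions the ratio of the upper bound to the lower bound grows only like $\sqrt{\log T}$ and carries no dependence on $d$. For instance, taking the logarithm as natural (also an assumption), $T = 10^6$ gives $\sqrt{\log T} \approx 3.7$.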

* 27 pages including Appendix
