Title: Efficient Multivariate Bandit Algorithm with Path Planning

Sep 06, 2019

Keyu Nie, Zezhong Zhang, Ted Tao Yuan, Rong Song, Pauline Berry Burke

Abstract: In this paper, we address the exponential explosion in the number of arms in the multivariate Multi-Armed Bandit (Multivariate-MAB) problem when the arm dimension hierarchy is considered. We propose a framework called path planning (TS-PP), which uses decision graphs/trees to model arm reward success rates with m-way dimension interactions and adopts Thompson sampling (TS) as a heuristic search for arm selection. The curse of dimensionality can then be combated naturally with a series of processes that run sequentially, each focusing on a single dimension. To the best of our knowledge, we are the first to solve the Multivariate-MAB problem using a graph path planning strategy and ideas akin to Monte Carlo tree search. Our proposed method, which uses tree models, has advantages over traditional models such as general linear regression. Simulation studies validate our claim, showing faster convergence, more efficient optimal arm allocation, and lower cumulative regret.

* Multi-Armed Bandit, Monte Carlo Tree Search, Decision Tree, Path Planning
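
The sequential, one-dimension-at-a-time idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' TS-PP implementation: the class `SequentialThompsonSampler`, the helper `simulate`, the Beta(1, 1) priors, and the Bernoulli reward model are all assumptions chosen for illustration. Each tree node is a prefix of dimension choices with its own Beta posterior; an arm is built by Thompson-sampling one dimension at a time, and the observed reward is credited to every node on the chosen path.

```python
import random
from collections import defaultdict


class SequentialThompsonSampler:
    """Hypothetical sketch: Thompson sampling down a tree of dimension choices."""

    def __init__(self, options_per_dim, seed=None):
        self.options_per_dim = list(options_per_dim)
        self.rng = random.Random(seed)
        # Assumed Beta(1, 1) prior on every node; a node is a prefix of choices.
        self.successes = defaultdict(lambda: 1.0)
        self.failures = defaultdict(lambda: 1.0)

    def select_arm(self):
        """Build an arm by picking one dimension value per step."""
        prefix = ()
        for n_options in self.options_per_dim:
            # Draw one posterior sample per child node and follow the largest.
            draws = [
                self.rng.betavariate(
                    self.successes[prefix + (v,)],
                    self.failures[prefix + (v,)],
                )
                for v in range(n_options)
            ]
            best = max(range(n_options), key=draws.__getitem__)
            prefix += (best,)
        return prefix

    def update(self, arm, reward):
        """Credit the binary reward to every node on the chosen path."""
        for depth in range(1, len(arm) + 1):
            node = arm[:depth]
            if reward:
                self.successes[node] += 1.0
            else:
                self.failures[node] += 1.0


def simulate(true_rates, options_per_dim, rounds=2000, seed=0):
    """Run the sampler against hypothetical Bernoulli arms; return pull counts."""
    sampler = SequentialThompsonSampler(options_per_dim, seed=seed)
    env_rng = random.Random(seed + 1)
    pulls = defaultdict(int)
    for _ in range(rounds):
        arm = sampler.select_arm()
        reward = env_rng.random() < true_rates[arm]
        sampler.update(arm, reward)
        pulls[arm] += 1
    return sampler, pulls
```

With `options_per_dim = [2, 3]` there are six arms, but each step samples only two or three child nodes rather than all six combinations, which is the point of treating dimensions serially. The `true_rates` argument is a dictionary mapping each arm tuple to an assumed success probability.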

View paper on

OpenReview
