While off-policy temporal difference (TD) methods have been widely used in reinforcement learning due to their efficiency and simple implementation, their Bayesian counterparts have not been utilized as frequently. One reason is that the non-linear max operation in the Bellman optimality equation makes it difficult to define conjugate distributions over the value functions. In this paper, we introduce a novel Bayesian approach to off-policy TD methods, called ADFQ, which updates beliefs on state-action values, Q, through an online Bayesian inference method known as Assumed Density Filtering. To obtain a closed-form update, we approximately estimate the analytic parameters of the posterior over the Q-beliefs. The uncertainty measures of the beliefs are not only used for exploration but also provide a natural regularization for learning. We show that ADFQ converges to Q-learning as the uncertainty measures of the Q-beliefs decrease. ADFQ mitigates common drawbacks of other Bayesian RL algorithms, such as high computational complexity. We also extend ADFQ with a neural network. Our empirical results demonstrate that the proposed ADFQ algorithm outperforms comparable algorithms on various domains, including continuous state domains and games from the Arcade Learning Environment.
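To make the general idea concrete, the following is a minimal, tabular sketch of maintaining Gaussian beliefs over Q-values and collapsing each update back to a Gaussian, in the spirit of assumed density filtering. It is an illustration under stated assumptions, not the ADFQ update derived in the paper: the class name GaussianQBeliefs, the observation-noise parameter obs_var, the precision-weighted combination with a greedy next-action target, and the Thompson-style action selection are all simplifications introduced here, whereas ADFQ handles the max over the next-state beliefs analytically.

```python
# A minimal sketch of Gaussian Q-beliefs with a moment-matched (Gaussian)
# posterior after each transition. Hypothetical simplification for
# illustration; not the exact ADFQ posterior from the paper.
import numpy as np


class GaussianQBeliefs:
    def __init__(self, n_states, n_actions, gamma=0.95,
                 init_mean=0.0, init_var=100.0, obs_var=1.0):
        self.mean = np.full((n_states, n_actions), init_mean)
        self.var = np.full((n_states, n_actions), init_var)
        self.gamma = gamma
        self.obs_var = obs_var  # assumed noise on the TD target (illustrative)

    def update(self, s, a, r, s_next, done):
        # Belief over the TD target r + gamma * max_a' Q(s', a').
        # Here we simply take the greedy next action's belief; ADFQ instead
        # treats the max over Gaussian beliefs analytically.
        if done:
            target_mean, target_var = r, 0.0
        else:
            a_star = int(np.argmax(self.mean[s_next]))
            target_mean = r + self.gamma * self.mean[s_next, a_star]
            target_var = (self.gamma ** 2) * self.var[s_next, a_star]

        # Precision-weighted combination of the prior belief on Q(s, a) and
        # the noisy target belief, i.e. the moment-matched Gaussian posterior
        # of this simplified model.
        prior_prec = 1.0 / self.var[s, a]
        target_prec = 1.0 / (target_var + self.obs_var)
        post_var = 1.0 / (prior_prec + target_prec)
        post_mean = post_var * (prior_prec * self.mean[s, a]
                                + target_prec * target_mean)
        self.mean[s, a], self.var[s, a] = post_mean, post_var

    def act(self, s, rng):
        # Thompson-style exploration: sample one value per action from its
        # belief and act greedily on the samples.
        samples = rng.normal(self.mean[s], np.sqrt(self.var[s]))
        return int(np.argmax(samples))
```

In this simplified sketch, as a belief's variance shrinks the posterior mean moves by ever smaller steps toward the TD target, which loosely mirrors the regularizing role of the uncertainty measures and the convergence toward Q-learning described above; the actual algorithm and its convergence analysis are given in the paper.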