Title: Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning

Oct 10, 2019

Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross


Abstract: The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity of maximum entropy reinforcement learning algorithms. Their popularity stems from the intuitive interpretation of the maximum entropy objective and their superior sample efficiency on standard benchmarks. In this paper, we seek to understand the primary contribution of the entropy term to the performance of maximum entropy algorithms. For the MuJoCo benchmark, we demonstrate that the entropy term in Soft Actor-Critic (SAC) principally addresses the bounded nature of the action spaces. With this insight, we propose a simple normalization scheme that allows a streamlined algorithm without entropy maximization to match the performance of SAC. Our experimental results demonstrate a need to revisit the benefits of entropy regularization in DRL. We also propose a simple non-uniform sampling method for selecting transitions from the replay buffer during training. We further show that the streamlined algorithm with the simple non-uniform sampling scheme outperforms SAC and achieves state-of-the-art performance on challenging continuous control tasks.
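
The abstract does not spell out the normalization scheme, so the following is only an illustrative sketch of how output normalization could address bounded action spaces. It assumes (this is an assumption, not a statement from the abstract) that the policy's raw outputs are rescaled whenever their mean absolute value exceeds 1, and are then squashed with tanh so that actions stay inside (-1, 1) without saturating. The function names `normalize_preactivations` and `squash` are hypothetical.

```python
import math


def normalize_preactivations(mu, max_mean_abs=1.0):
    """Rescale raw policy outputs so their mean absolute value is at most max_mean_abs.

    Assumed rule for illustration only; the abstract does not give the exact scheme.
    """
    if not mu:
        return []
    mean_abs = sum(abs(m) for m in mu) / len(mu)
    if mean_abs > max_mean_abs:
        scale = max_mean_abs / mean_abs
        return [m * scale for m in mu]
    # Outputs that are already small are left unchanged.
    return list(mu)


def squash(mu):
    """Map each output into the bounded action range (-1, 1) with tanh."""
    return [math.tanh(m) for m in mu]


# Large raw outputs would push tanh into saturation; normalizing first avoids that.
raw = [4.0, -6.0, 2.0]
normalized = normalize_preactivations(raw)
action = squash(normalized)
```

Without the rescaling step, `tanh(4.0)` and `tanh(-6.0)` sit almost exactly at the action bounds, where gradients through the squashing function nearly vanish. Keeping the pre-squash values small is one way to get a similar effect to what the abstract attributes to the entropy term.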

View paper on OpenReview
