Title: Reinforcement learning with experience replay and adaptation of action dispersion

Jul 30, 2022

Paweł Wawrzyński, Wojciech Masarczyk, Mateusz Ostaszewski

Abstract: Effective reinforcement learning requires a proper balance of exploration and exploitation, defined by the dispersion of the action distribution. However, this balance depends on the task, the current stage of the learning process, and the current environment state. Existing methods that designate the action distribution dispersion require problem-dependent hyperparameters. In this paper, we propose to automatically designate the action distribution dispersion using the following principle: this distribution should have sufficient dispersion to enable the evaluation of future policies. To that end, the dispersion should be tuned to assure sufficiently high probabilities (densities) of the actions in the replay buffer and of the modes of the distributions that generated them, yet it should be no higher than necessary. This way, a policy can be effectively evaluated based on the actions in the buffer, but exploratory randomness in actions decreases when this policy converges. The above principle is verified here on the challenging benchmarks Ant, HalfCheetah, Hopper, and Walker2D, with good results. Our method makes the action standard deviations converge to values similar to those resulting from trial-and-error optimization.
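
The principle in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes a diagonal Gaussian policy with a state-independent standard deviation, and it simply picks the standard deviation that maximizes the density of the buffered actions around the current policy's means, ignoring the paper's additional condition on the modes of the distributions that generated those actions. The names `gaussian_log_density` and `adapt_log_std`, the learning rate, and the synthetic buffers are all hypothetical.

```python
import numpy as np


def gaussian_log_density(actions, mean, log_std):
    """Log-density of each action under a diagonal Gaussian N(mean, exp(log_std)^2)."""
    z = (actions - mean) / np.exp(log_std)
    return np.sum(-0.5 * z**2 - log_std - 0.5 * np.log(2.0 * np.pi), axis=-1)


def adapt_log_std(actions, mean, log_std, lr=0.05, steps=500):
    """Gradient ascent on the mean log-density of buffered actions w.r.t. log_std.

    Assumption (not from the paper): the dispersion is fitted by plain
    maximum likelihood, so it ends up just wide enough to cover the buffer.
    """
    log_std = np.array(log_std, dtype=float)  # copy, so the caller's array is untouched
    for _ in range(steps):
        z2 = ((actions - mean) / np.exp(log_std)) ** 2
        grad = np.mean(z2 - 1.0, axis=0)  # derivative of the mean log-density w.r.t. log_std
        log_std += lr * grad
    return log_std


rng = np.random.default_rng(0)
dim = 2
policy_mean = np.zeros((1000, dim))  # current policy's mean action at each buffered state

# Early in training: buffered actions lie far from what the current policy would do.
early_actions = policy_mean + rng.normal(0.0, 1.0, size=(1000, dim))
# Late in training: the policy has nearly converged, so buffered actions lie close to its means.
late_actions = policy_mean + rng.normal(0.0, 0.1, size=(1000, dim))

early_std = np.exp(adapt_log_std(early_actions, policy_mean, np.zeros(dim)))
late_std = np.exp(adapt_log_std(late_actions, policy_mean, np.zeros(dim)))
```

In this toy setting, `late_std` comes out smaller than `early_std`, which mirrors the abstract's claim that exploratory randomness decreases as the policy converges. The fixed point of the update is the root-mean-square deviation of the buffered actions from the current means, so no problem-dependent target dispersion has to be supplied.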
