Title: Variance Reduction based Experience Replay for Policy Optimization

Aug 25, 2022

Hua Zheng, Wei Xie, M. Ben Feng

Abstract: For reinforcement learning on complex stochastic systems where many factors dynamically impact the output trajectories, it is desirable to effectively leverage the information from historical samples collected in previous iterations to accelerate policy optimization. Classical experience replay allows agents to remember by reusing historical observations. However, the uniform reuse strategy that treats all observations equally overlooks the relative importance of different samples. To overcome this limitation, we propose a general variance reduction based experience replay (VRER) framework that selectively reuses the most relevant samples to improve policy gradient estimation. This selective mechanism adaptively puts more weight on past samples that are more likely to have been generated by the current target distribution. Our theoretical and empirical studies show that the proposed VRER can accelerate the learning of the optimal policy and enhance the performance of state-of-the-art policy optimization approaches.

* 39 pages, 5 figures. arXiv admin note: text overlap with arXiv:2110.08902
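The selective-reuse idea in the abstract can be illustrated with a small sketch. This is not the paper's VRER selection rule, which the abstract does not spell out. It assumes a one-dimensional Gaussian policy with fixed standard deviation, and it substitutes a common diagnostic, the effective sample size (ESS) of the importance weights, as the criterion for deciding which past batches are "relevant". The function names, the `min_ess_fraction` threshold, and the data are all hypothetical.

```python
import numpy as np


def gaussian_logpdf(a, mean, std):
    """Log-density of N(mean, std^2) evaluated at a."""
    return -0.5 * ((a - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2 * np.pi)


def select_and_weight(actions_by_iter, past_means, current_mean,
                      std=1.0, min_ess_fraction=0.5):
    """Pick past batches whose importance weights are well behaved.

    Illustrative stand-in for selective reuse (an assumption, not the
    paper's rule): a batch from iteration k is kept when the effective
    sample size of its likelihood ratios pi_current / pi_k is at least
    min_ess_fraction of the batch size.
    """
    selected, weights = [], []
    for k, (actions, mean_k) in enumerate(zip(actions_by_iter, past_means)):
        # Likelihood ratio of each past action under the current policy
        # versus the policy that generated it.
        log_ratio = (gaussian_logpdf(actions, current_mean, std)
                     - gaussian_logpdf(actions, mean_k, std))
        w = np.exp(log_ratio)
        # ESS = (sum w)^2 / sum w^2; it equals the batch size when all
        # weights are equal and shrinks as the weights degenerate.
        ess = w.sum() ** 2 / (w ** 2).sum()
        if ess / len(w) >= min_ess_fraction:
            selected.append(k)
            weights.append(w)
    return selected, weights


# Hypothetical data: three past iterations with policy means 0.0, 3.0, 0.1;
# the current policy mean is 0.0.
batches = [np.array([-1.0, 0.0, 1.0]),
           np.array([2.0, 3.0, 4.0]),
           np.array([-0.9, 0.1, 1.1])]
selected, weights = select_and_weight(batches, [0.0, 3.0, 0.1], current_mean=0.0)
print(selected)
```

In this toy setup, the batch generated under a policy far from the current one (mean 3.0) is dropped, while batches from nearby policies are kept and receive weights close to 1. The returned weights could then multiply per-sample gradient terms in an importance-weighted policy gradient estimate.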
