
Title: Distributed Policy Gradient with Variance Reduction in Multi-Agent Reinforcement Learning

Nov 30, 2021

Xiaoxiao Zhao, Jinlong Lei, Li Li



Abstract: This paper studies distributed policy gradient methods for collaborative multi-agent reinforcement learning (MARL), in which agents connected by a communication network aim to find the optimal policy that maximizes the average of all agents' local returns. Because the performance function in policy gradient is non-concave, existing distributed stochastic optimization methods designed for convex problems cannot be applied directly to policy gradient in MARL. This paper proposes a distributed policy gradient algorithm that combines variance reduction with gradient tracking to address the high variance of policy gradient estimates, and uses importance weighting to handle non-stationarity in the sampling process. We then provide an upper bound on the mean-squared stationary gap, which depends on the number of iterations, the mini-batch size, the epoch size, the problem parameters, and the network topology. We further establish the sample and communication complexity required to obtain an $\epsilon$-approximate stationary point. Numerical experiments on a control problem in MARL validate the effectiveness of the proposed algorithm.
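To make the gradient-tracking idea mentioned in the abstract concrete, here is a minimal toy sketch. It is not the paper's algorithm: there is no policy, no sampling, no variance reduction, and no importance weighting. It assumes each agent i holds a simple concave local objective f_i(x) = -0.5 (x - b_i)^2 (a stand-in for a local return), and that agents on a ring exchange values through a doubly stochastic mixing matrix. The names `ring_mixing_matrix` and `gradient_tracking_ascent`, the step size, and the iteration count are all illustrative choices.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic weights for a ring: each agent averages itself and its two neighbors.

    Assumes n >= 3 so that the two neighbors are distinct.
    """
    shift = np.roll(np.eye(n), 1, axis=1)
    return (np.eye(n) + shift + shift.T) / 3.0


def gradient_tracking_ascent(b, W, alpha=0.1, iters=500):
    """Toy gradient tracking for maximizing the average of f_i(x) = -0.5 * (x - b_i)^2.

    x[i] is agent i's local estimate; y[i] tracks the network-average gradient.
    """
    b = np.asarray(b, dtype=float)

    def grad(x):
        # Local gradients, one per agent: d/dx f_i(x_i) = b_i - x_i
        return b - x

    x = np.zeros_like(b)
    y = grad(x)  # trackers start at the local gradients
    for _ in range(iters):
        # Mix with neighbors, then ascend along the tracked gradient
        x_new = W @ x + alpha * y
        # Mix trackers and correct them with the change in local gradients
        y = W @ y + grad(x_new) - grad(x)
        x = x_new
    return x


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 6.0]
    estimates = gradient_tracking_ascent(targets, ring_mixing_matrix(4))
    print(estimates, np.mean(targets))
```

In this toy setting every agent's estimate approaches the mean of the b_i, which is the maximizer of the average objective, even though each agent only ever evaluates its own gradient and talks to its two neighbors. The non-concave, sampled setting studied in the paper is considerably harder, which is why it guarantees only an approximate stationary point.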
