Han Long

Cooperation and Competition: Flocking with Evolutionary Multi-Agent Reinforcement Learning

Sep 13, 2022

Yunxiao Guo, Xinjia Xie, Runhao Zhao, Chenglan Zhu, Jiangting Yin, Han Long

Abstract: Flocking is a challenging problem in multi-agent systems, and traditional flocking methods require complete knowledge of the environment and a precise model for control. In this paper, we propose Evolutionary Multi-Agent Reinforcement Learning (EMARL) for flocking tasks, a hybrid algorithm that combines cooperation and competition and needs little prior knowledge. For cooperation, we design the agents' reward for flocking tasks according to the boids model. For competition, agents with high fitness are designated senior agents and those with low fitness junior agents, and junior agents stochastically inherit the parameters of senior agents. To intensify competition, we also design an evolutionary selection mechanism that proves effective for credit assignment in flocking tasks. Experimental results on a range of challenging and self-contrast benchmarks demonstrate that EMARL significantly outperforms fully competitive or fully cooperative methods.
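The senior/junior inheritance step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `inherit_parameters`, the `senior_fraction` and `inherit_prob` parameters, the ranking rule, and the representation of an agent's parameters as a flat list of floats.

```python
import random

def inherit_parameters(params, fitness, senior_fraction=0.5,
                       inherit_prob=0.5, rng=None):
    """Hypothetical sketch of stochastic parameter inheritance.

    params  : list of per-agent parameter lists (assumed flat floats)
    fitness : list of per-agent fitness values, same length as params
    Returns (new_params, senior_indices, junior_indices).
    """
    rng = rng or random.Random()
    n = len(params)

    # Rank agents from highest to lowest fitness.
    order = sorted(range(n), key=lambda i: fitness[i], reverse=True)

    # Assumption: the top fraction are seniors, the rest are juniors.
    n_senior = max(1, int(n * senior_fraction))
    seniors, juniors = order[:n_senior], order[n_senior:]

    # Copy so that the caller's parameters are left untouched.
    new_params = [list(p) for p in params]

    for j in juniors:
        # Each junior copies a randomly chosen senior with some probability.
        if rng.random() < inherit_prob:
            donor = rng.choice(seniors)
            new_params[j] = list(params[donor])

    return new_params, seniors, juniors


if __name__ == "__main__":
    agents = [[0.0], [1.0], [2.0], [3.0]]
    scores = [0.1, 0.9, 0.2, 0.8]
    updated, seniors, juniors = inherit_parameters(
        agents, scores, rng=random.Random(0))
    print("seniors:", seniors, "juniors:", juniors)
    print("updated parameters:", updated)
```

Senior agents keep their own parameters in this sketch, so only low-fitness agents are overwritten. The reward design and the evolutionary selection mechanism the abstract mentions are not modelled here.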

* We misplaced Fig. 5(b) on page 11 (this figure is from early experiments with poor results). We failed to resubmit, so we want to take this chance to revise the whole paper.

CIM-PPO:Proximal Policy Optimization with Liu-Correntropy Induced Metric

Oct 20, 2021

Yunxiao Guo, Han Long, Xiaojun Duan, Kaiyuan Feng, Maochu Li, Xiaying Ma

Abstract: As an algorithm based on deep reinforcement learning, Proximal Policy Optimization (PPO) performs well in many complex tasks and has become one of the most popular RL algorithms in recent years. According to the penalty mechanism in the surrogate objective, PPO can be divided into PPO with KL divergence (KL-PPO) and PPO with a clip function (Clip-PPO). Clip-PPO is widely used in a variety of practical scenarios and has attracted the attention of many researchers, so many variants have been developed that further improve it. However, KL-PPO, the more theoretically grounded algorithm, has been neglected because its performance is not as good as that of Clip-PPO. In this article, we analyze the asymmetry effect of KL divergence on PPO's objective function and give an inequality that indicates when the asymmetry will affect the efficiency of KL-PPO. We propose PPO with Correntropy Induced Metric (CIM-PPO), which applies the theory of correntropy (a symmetric metric widely used in M-estimation to evaluate the difference between two distributions) to PPO. We then design experiments based on OpenAI Gym to test the effectiveness of the new algorithm and compare it with KL-PPO and Clip-PPO.
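The asymmetry the abstract refers to can be seen on a toy example. The sketch below compares KL divergence in both directions with a correntropy induced metric of the commonly cited form sqrt(k(0) - V), where V is the mean kernel value of the elementwise differences. The unnormalised Gaussian kernel, the bandwidth `sigma`, and the choice to apply the metric elementwise to two discrete probability vectors are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cim(p, q, sigma=1.0):
    """Correntropy induced metric sketch: sqrt(k(0) - V(p, q)).

    Assumes an unnormalised Gaussian kernel, so k(0) = 1, and estimates
    the correntropy V as the mean kernel value over elementwise differences.
    """
    n = len(p)
    v = sum(math.exp(-((pi - qi) ** 2) / (2 * sigma ** 2))
            for pi, qi in zip(p, q)) / n
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - v))


if __name__ == "__main__":
    p = [0.9, 0.1]
    q = [0.5, 0.5]
    print("KL(p||q):", kl_divergence(p, q))
    print("KL(q||p):", kl_divergence(q, p))
    print("CIM(p,q):", cim(p, q))
    print("CIM(q,p):", cim(q, p))
```

Swapping the arguments changes the KL value but leaves the CIM value unchanged, which is the property the abstract highlights when it calls correntropy a symmetric metric.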
