Abstract: Offline reinforcement learning (RL) has received considerable attention in recent years due to its attractive capability of learning policies from offline datasets without environmental interactions. Despite some success in the single-agent setting, offline multi-agent RL (MARL) remains a challenge. The large joint state-action space and the coupled multi-agent behaviors pose extra complexities for offline policy optimization. Most existing offline MARL studies simply apply offline data-related regularizations to individual agents, without fully considering the multi-agent system at the global level. In this work, we present OMIGA, a new offline multi-agent RL algorithm with implicit global-to-local value regularization. OMIGA provides a principled framework for converting global-level value regularization into equivalent implicit local value regularizations while simultaneously enabling in-sample learning, thus elegantly bridging multi-agent value decomposition and policy learning with offline regularizations. Based on comprehensive experiments on offline multi-agent MuJoCo and StarCraft II micro-management tasks, we show that OMIGA achieves superior performance over state-of-the-art offline MARL methods in almost all tasks.
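For concreteness, the sketch below illustrates the kind of per-agent, in-sample update that an implicit global-to-local value regularization enables: local value functions are fit only on dataset actions, and the local policy is extracted by advantage weighting. An IQL-style expectile regression and exponential advantage weights stand in for the paper's own derivation, and the interfaces (`q_net`, `v_net`, `policy.log_prob`) and the temperature `alpha` are illustrative assumptions, not OMIGA's exact formulation.

```python
import torch
import torch.nn.functional as F

def expectile_loss(diff, tau=0.7):
    # Asymmetric squared loss; tau > 0.5 pushes V toward an upper expectile of Q,
    # approximating an in-sample max without querying unseen actions.
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff.pow(2)).mean()

def local_update_losses(q_net, v_net, policy, batch, gamma=0.99, alpha=3.0):
    """Per-agent losses computed only on dataset (in-sample) actions."""
    obs, act, rew, next_obs, done = batch  # tensors sampled from the offline dataset

    # Local V is regressed toward local Q evaluated at dataset actions only.
    with torch.no_grad():
        q_data = q_net(obs, act)
    v = v_net(obs)
    v_loss = expectile_loss(q_data - v)

    # Local Q bootstraps from V at the next state, so no out-of-distribution
    # argmax is ever taken.
    with torch.no_grad():
        target = rew + gamma * (1.0 - done) * v_net(next_obs)
    q_loss = F.mse_loss(q_net(obs, act), target)

    # Advantage-weighted policy extraction: an implicit KL-style regularization
    # toward the behavior policy. `policy.log_prob` is a hypothetical interface.
    with torch.no_grad():
        w = torch.exp((q_data - v) / alpha).clamp(max=100.0)
    pi_loss = -(w * policy.log_prob(obs, act)).mean()
    return v_loss, q_loss, pi_loss
```

In a global-to-local scheme, the local quantities above would come from a value decomposition conditioned on the global state; the key point the sketch captures is that every loss term is evaluated on actions drawn from the dataset.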
Abstract: Offline reinforcement learning (RL), which learns policies from offline datasets without environment interaction, has received considerable attention in recent years. Compared with the rich literature in the single-agent case, offline multi-agent RL is still a relatively underexplored area. Most existing methods directly apply offline RL ingredients in the multi-agent setting without fully leveraging the decomposable problem structure, leading to less satisfactory performance on complex tasks. We present OMAC, a new offline multi-agent RL algorithm with coupled value factorization. OMAC adopts a coupled value factorization scheme that decomposes the global value function into local and shared components, and also maintains credit assignment consistency between the state-value and Q-value functions. Moreover, OMAC performs in-sample learning on the decomposed local state-value functions, which implicitly conducts the max-Q operation at the local level while avoiding the distributional shift caused by evaluating out-of-distribution actions. Based on comprehensive evaluations on offline multi-agent StarCraft II micro-management tasks, we demonstrate the superior performance of OMAC over state-of-the-art offline multi-agent RL methods.
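The sketch below shows one way such a coupled factorization can be wired: a single state-conditioned mixer supplies the same non-negative weights to both the Q and V decompositions, and an expectile regression on the local state-values serves as the implicit local max-Q over in-distribution actions. Module names and hyperparameters are assumptions for illustration, not OMAC's exact implementation.

```python
import torch
import torch.nn as nn

class CoupledMixer(nn.Module):
    """State-conditioned mixer shared by the Q_tot and V_tot decompositions."""
    def __init__(self, state_dim, n_agents, hidden=64):
        super().__init__()
        self.w = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, n_agents))
        self.b = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, 1))

    def forward(self, state, local_vals):
        # local_vals: (batch, n_agents), either local q_i(o_i, a_i) or v_i(o_i)
        w = torch.relu(self.w(state))   # non-negative weights keep mixing monotonic
        b = self.b(state)               # shared (global) component
        return (w * local_vals).sum(dim=-1, keepdim=True) + b

def local_expectile_loss(v_local, q_local, tau=0.9):
    # In-sample regression of v_i toward q_i on dataset actions only; as tau -> 1
    # this approximates a local max over in-distribution actions.
    diff = q_local.detach() - v_local
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff.pow(2)).mean()
```

Using the same `CoupledMixer` instance for both `Q_tot = mixer(state, q_locals)` and `V_tot = mixer(state, v_locals)` is what keeps the credit assignment consistent between the two value functions in this sketch.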
Abstract: Previous deep multi-agent reinforcement learning (MARL) algorithms have achieved impressive results, typically in homogeneous scenarios. However, heterogeneous scenarios are also very common and are usually harder to solve. In this paper, we focus on cooperative heterogeneous MARL problems in the StarCraft Multi-Agent Challenge (SMAC) environment. We first define and describe the heterogeneous problems in SMAC. To reveal and study the problem comprehensively, we add new maps to the original SMAC maps. We find that baseline algorithms fail to perform well on these heterogeneous maps. To address this issue, we propose the Grouped Individual-Global-Max Consistency (GIGM) and a novel MARL algorithm, Grouped Hybrid Q-Learning (GHQ). GHQ separates agents into several groups and keeps individual parameters for each group, along with a novel hybrid structure for factorization. To enhance coordination between groups, we maximize the Inter-group Mutual Information (IGMI) between groups' trajectories. Experiments on the original and new heterogeneous maps show that GHQ substantially outperforms other state-of-the-art algorithms.
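The sketch below illustrates one plausible wiring of grouped parameters with a hybrid two-stage factorization: each group of same-type agents shares one Q-network, chosen-action values are summed within each group, and an inter-group mixer combines the group values into a joint value. The grouping dictionary, module names, the plain MLP mixer (a GIGM-style condition would instead constrain it, e.g. to monotonic weights), and the omission of the IGMI term are illustrative assumptions rather than GHQ's actual architecture.

```python
import torch
import torch.nn as nn

class GroupAgentNet(nn.Module):
    """One Q-network shared by all agents inside a single group."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))

    def forward(self, obs):              # obs: (batch, n_group_agents, obs_dim)
        return self.net(obs)             # per-action Q-values for each group agent

class GroupedHybridQ(nn.Module):
    def __init__(self, groups, obs_dim, n_actions, state_dim, hidden=64):
        super().__init__()
        # groups: dict mapping group name -> agent indices, e.g. a hypothetical
        # heterogeneous team {"marines": [0, 1, 2], "medivacs": [3]}
        self.groups = groups
        self.agent_nets = nn.ModuleDict(
            {g: GroupAgentNet(obs_dim, n_actions, hidden) for g in groups})
        # Inter-group mixer: combines one value per group (plus the state) into Q_tot.
        self.mixer = nn.Sequential(nn.Linear(state_dim + len(groups), hidden),
                                   nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, obs, actions, state):
        # obs: (batch, n_agents, obs_dim); actions: (batch, n_agents) long indices
        group_qs = []
        for g, idx in self.groups.items():
            q_all = self.agent_nets[g](obs[:, idx])                    # per-action Qs
            q_taken = q_all.gather(-1, actions[:, idx].unsqueeze(-1))  # chosen actions
            group_qs.append(q_taken.squeeze(-1).sum(dim=-1))           # intra-group sum
        group_qs = torch.stack(group_qs, dim=-1)                       # (batch, n_groups)
        return self.mixer(torch.cat([state, group_qs], dim=-1))        # joint value
```

Keeping a separate `GroupAgentNet` per group is what gives each heterogeneous unit type its own parameters while still sharing weights among agents of the same type.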