Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wenze Chen

TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play

Feb 21, 2023

Fanqi Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Wei-Wei Tu

Abstract:Multi-agent football poses an unsolved challenge in AI research. Existing work has focused on tackling simplified scenarios of the game, or else leveraging expert demonstrations. In this paper, we develop a multi-agent system to play the full 11 vs. 11 game mode, without demonstrations. This game mode contains aspects that present major challenges to modern reinforcement learning algorithms; multi-agent coordination, long-term planning, and non-transitivity. To address these challenges, we present TiZero; a self-evolving, multi-agent system that learns from scratch. TiZero introduces several innovations, including adaptive curriculum learning, a novel self-play strategy, and an objective that optimizes the policies of multiple agents jointly. Experimentally, it outperforms previous systems by a large margin on the Google Research Football environment, increasing win rates by over 30%. To demonstrate the generality of TiZero's innovations, they are assessed on several environments beyond football; Overcooked, Multi-agent Particle-Environment, Tic-Tac-Toe and Connect-Four.

* The 22nd International Conference on Autonomous Agents and Multiagent Systems(AAMAS2023)

Via

Access Paper or Ask Questions

DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Jul 12, 2022

Wenze Chen, Shiyu Huang, Yuan Chiang, Ting Chen, Jun Zhu

Figure 1 for DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Figure 2 for DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Figure 3 for DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Figure 4 for DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Abstract:Recent algorithms designed for reinforcement learning tasks focus on finding a single optimal solution. However, in many practical applications, it is important to develop reasonable agents with diverse strategies. In this paper, we propose Diversity-Guided Policy Optimization (DGPO), an on-policy framework for discovering multiple strategies for the same task. Our algorithm uses diversity objectives to guide a latent code conditioned policy to learn a set of diverse strategies in a single training procedure. Specifically, we formalize our algorithm as the combination of a diversity-constrained optimization problem and an extrinsic-reward constrained optimization problem. And we solve the constrained optimization as a probabilistic inference task and use policy iteration to maximize the derived lower bound. Experimental results show that our method efficiently finds diverse strategies in a wide variety of reinforcement learning tasks. We further show that DGPO achieves a higher diversity score and has similar sample complexity and performance compared to other baselines.

Via

Access Paper or Ask Questions

TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

Oct 19, 2021

Shiyu Huang, Wenze Chen, Longfei Zhang, Ziyang Li, Fengming Zhu, Deheng Ye, Ting Chen, Jun Zhu

Figure 1 for TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

Figure 2 for TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

Figure 3 for TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

Figure 4 for TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

Abstract:Deep reinforcement learning (DRL) has achieved super-human performance on complex video games (e.g., StarCraft II and Dota II). However, current DRL systems still suffer from challenges of multi-agent coordination, sparse rewards, stochastic environments, etc. In seeking to address these challenges, we employ a football video game, e.g., Google Research Football (GRF), as our testbed and develop an end-to-end learning-based AI system (denoted as TiKick) to complete this challenging task. In this work, we first generated a large replay dataset from the self-playing of single-agent experts, which are obtained from league training. We then developed a distributed learning system and new offline algorithms to learn a powerful multi-agent AI from the fixed single-agent dataset. To the best of our knowledge, Tikick is the first learning-based AI system that can take over the multi-agent Google Research Football full game, while previous work could either control a single agent or experiment on toy academic scenarios. Extensive experiments further show that our pre-trained model can accelerate the training process of the modern multi-agent algorithm and our method achieves state-of-the-art performances on various academic scenarios.

Via

Access Paper or Ask Questions