Abstract: An AI agent should be able to coordinate with humans to solve tasks. We consider the problem of training a Reinforcement Learning (RL) agent to collaborate with humans without using any human data, i.e., in a zero-shot setting. Standard RL agents learn through self-play. Unfortunately, such agents only know how to collaborate with themselves and typically perform poorly with unseen partners, such as humans. How to train a robust agent in a zero-shot fashion remains an open research problem. Motivated by maximum entropy RL, we derive a centralized population entropy objective that facilitates learning a diverse population of agents, which is later used to train a robust agent that can collaborate with unseen partners. The proposed method demonstrates its effectiveness against baseline methods, including self-play PPO, standard Population-Based Training (PBT), and trajectory diversity-based PBT, in the popular Overcooked game environment. We also conduct online experiments with real humans, further demonstrating the efficacy of the method in the real world. A supplementary video showing experimental results is available at https://youtu.be/Xh-FKD0AAKE.
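To make the population entropy idea concrete, below is a minimal Python sketch of one plausible instantiation: the entropy of the mean action distribution across the population at a given state, which would be added (scaled by a coefficient) to the task reward during population training to encourage behavioral diversity. The function name, the mixture-policy formulation, and the toy numbers are illustrative assumptions; the paper derives the exact centralized objective.

import numpy as np

def population_entropy(action_probs):
    """Entropy of the mean policy over a population at one state.

    action_probs: (num_agents, num_actions) array, where each row is one
    policy's action distribution at the same state. Hypothetical helper;
    the paper's exact centralized objective may differ.
    """
    mean_policy = action_probs.mean(axis=0)  # mixture policy pi_bar(a|s)
    return -np.sum(mean_policy * np.log(mean_policy + 1e-8))

# Toy usage: three agents at one state, each preferring a different action.
pop = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
])
bonus = population_entropy(pop)  # diversity bonus added to the task reward
print(f"population entropy bonus: {bonus:.3f}")

Intuitively, the mixture entropy is maximized when the agents' policies spread their probability mass over different actions, so optimizing this bonus pushes the population toward distinct behaviors rather than identical self-play strategies.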