Poster in Workshop on Behavioral Machine Learning

Learning to Cooperate with Humans using Generative Agents

Yancheng Liang · Daphne Chen · Abhishek Gupta · Simon Du · Natasha Jaques


Abstract:

Training agents that can coordinate zero-shot with humans is a key mission in multi-agent reinforcement learning (MARL). Current algorithms focus on training simulated human partner policies, which are then used to train a Cooperator agent. The simulated human is produced either through behavior cloning over a human dataset, or by using MARL to create a population of simulated agents. However, these approaches often struggle to produce an effective Cooperator, since the simulated humans fail to cover the diverse strategies employed by people in the real world. We show that learning a generative model of human partners can effectively address this issue. Our model learns a latent variable representation of the human that can be regarded as encoding the human's unique strategy, intention, experience, or style. This generative model can be flexibly trained on interaction data from any agent (human or neural policy), unifying approaches proposed in prior work. By sampling from the latent space, we can use the generative model to produce different partners to train Cooperator agents. We evaluate our method, Generative Agent Modeling for Multi-agent Adaptation (GAMMA), on Overcooked, a challenging cooperative cooking game that has become a standard benchmark for zero-shot coordination. In an evaluation with real human teammates, GAMMA consistently improves performance, whether the generative model is trained on simulated populations or human datasets. Further, we propose a method for posterior sampling from the generative model, biased towards the human data, enabling us to efficiently improve performance with only a small amount of expensive human interaction data.
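To make the recipe concrete, here is a minimal sketch of the GAMMA-style idea described above: fit a VAE-like generative model over partner trajectories, then sample latent "strategy" vectors to instantiate diverse simulated partners for Cooperator training. This is not the authors' implementation; every module name, dimension, hyperparameter, and the toy data below are assumptions made for illustration, and the Cooperator's RL loop is only indicated in a comment.

```python
# Hypothetical sketch of a GAMMA-style generative partner model.
# Assumed: observation/action/latent sizes, architectures, toy data.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, LATENT_DIM = 16, 6, 8

class TrajectoryEncoder(nn.Module):
    """Encodes a partner trajectory into a Gaussian latent (VAE-style)."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(OBS_DIM + ACT_DIM, 64, batch_first=True)
        self.mu = nn.Linear(64, LATENT_DIM)
        self.logvar = nn.Linear(64, LATENT_DIM)

    def forward(self, traj):                  # traj: (B, T, OBS_DIM+ACT_DIM)
        _, h = self.rnn(traj)                 # h: (1, B, 64)
        h = h.squeeze(0)
        return self.mu(h), self.logvar(h)

class LatentPartnerPolicy(nn.Module):
    """A partner policy conditioned on a latent 'strategy' vector z."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + LATENT_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM))

    def forward(self, obs, z):                # obs: (B, T, OBS_DIM)
        return self.net(torch.cat([obs, z], dim=-1))  # action logits

enc, partner = TrajectoryEncoder(), LatentPartnerPolicy()
opt = torch.optim.Adam(
    list(enc.parameters()) + list(partner.parameters()), lr=3e-4)

# Toy random data standing in for human / simulated-agent interaction data.
trajs = torch.randn(32, 10, OBS_DIM + ACT_DIM)
obs = trajs[:, :, :OBS_DIM]
acts = trajs[:, :, OBS_DIM:].argmax(-1)       # pretend actions were one-hot

for step in range(100):
    mu, logvar = enc(trajs)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
    logits = partner(obs, z.unsqueeze(1).expand(-1, obs.shape[1], -1))
    # Reconstruct the partner's actions (behavior-cloning likelihood term).
    recon = nn.functional.cross_entropy(
        logits.reshape(-1, ACT_DIM), acts.reshape(-1))
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    loss = recon + 1e-3 * kl
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling z ~ N(0, I) yields diverse simulated partners; a Cooperator
# would then be trained (e.g., with PPO) against partner(obs, z).
z_new = torch.randn(1, LATENT_DIM)
```

Under this reading, the posterior-sampling variant mentioned in the abstract would plausibly draw z from the encoder's posterior over a small set of human trajectories rather than from the N(0, I) prior, concentrating Cooperator training on human-like partners.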
