Poster in Workshop: Intrinsically Motivated Open-ended Learning (IMOL)
InfiniteKitchen: Cross-environment Cooperation for Zero-shot Multi-agent Coordination
Kunal Jha · Natasha Jaques · Max Kleiman-Weiner
Keywords: [ procedural environment generation ] [ multi-agent interactions ] [ zero-shot coordination ]
Zero-shot coordination (ZSC) is an important challenge for developing adaptable AI systems that are capable of collaborating with humans on unfamiliar tasks. While prior work has mainly focused on adapting to new partners \citep{hu2021offbelieflearning, strouse2022collaboratinghumanshumandata}, generalizing cooperation across different environments is equally important. This paper investigates training AI agents in self-play (SP) to achieve zero-shot collaboration with novel partners on novel tasks. We introduce Infinite Kitchen, a new JAX-based, procedurally generated environment for multi-agent reinforcement learning. Our rule-based generator creates billions of solvable kitchen configurations, enabling the training of a single, generalizable agent that can adapt to new levels. Our results show that exposure to diverse levels in self-play consistently improves generalization to new partners, with architectures based on graph neural networks (GNNs) achieving the highest performance across many layouts. Our findings suggest that learning to collaborate across a multitude of unique scenarios encourages agents to develop maximally general norms, which prove highly effective for collaboration with different partners when combined with appropriate inductive biases.
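To make the "rule-based generator of solvable kitchen configurations" concrete, below is a minimal illustrative sketch in Python/NumPy, not the authors' implementation (the actual Infinite Kitchen generator is JAX-based and handles a richer tile set). All names here (sample_layout, solvable, generate_level, the tile constants, and the specific placement and connectivity rules) are hypothetical; the sketch only shows the general pattern of sampling a candidate layout and rejection-sampling until a solvability check passes.

```python
import numpy as np

# Hypothetical tile codes for a grid-world kitchen (illustrative only).
WALL, FLOOR, POT, ONION, PLATE, SERVE = 0, 1, 2, 3, 4, 5
STATIONS = (POT, ONION, PLATE, SERVE)

def sample_layout(rng, height=7, width=9, wall_prob=0.15):
    """Sample a candidate kitchen: random interior walls plus one of each
    station placed on the outer counter ring (assumed placement rule)."""
    grid = np.full((height, width), FLOOR, dtype=np.int8)
    grid[0, :] = grid[-1, :] = grid[:, 0] = grid[:, -1] = WALL
    interior = rng.random((height - 2, width - 2)) < wall_prob
    grid[1:-1, 1:-1][interior] = WALL
    border = [(r, c) for r in range(height) for c in range(width)
              if r in (0, height - 1) or c in (0, width - 1)]
    picks = rng.choice(len(border), size=len(STATIONS), replace=False)
    for station, idx in zip(STATIONS, picks):
        grid[border[idx]] = station
    return grid

def solvable(grid):
    """Simple solvability rule: the floor forms one connected region and
    every station is adjacent to that region (flood fill)."""
    floors = np.argwhere(grid == FLOOR)
    if len(floors) < 2:          # need room for two agents
        return False
    seen, stack = set(), [tuple(floors[0])]
    while stack:
        r, c = stack.pop()
        if (r, c) in seen or grid[r, c] != FLOOR:
            continue
        seen.add((r, c))
        stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    if len(seen) != len(floors):
        return False
    for r, c in np.argwhere(np.isin(grid, STATIONS)):
        neighbors = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        if not any(n in seen for n in neighbors):
            return False
    return True

def generate_level(seed):
    """Rejection-sample candidate layouts until one passes the check."""
    rng = np.random.default_rng(seed)
    while True:
        grid = sample_layout(rng)
        if solvable(grid):
            return grid
```

Because every layout is validated before training, an agent trained with self-play across seeds drawn from such a generator never sees an unsolvable task, which is what allows the diversity of configurations to scale without manual level design.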