

Poster in Workshop: Agent Learning in Open-Endedness Workshop

minimax: Efficient Baselines for Autocurricula in JAX

Minqi Jiang · Michael Dennis · Edward Grefenstette · Tim Rocktäschel

Keywords: [ baselines ] [ autocurricula ] [ Benchmarks ] [ Reinforcement Learning ] [ environment design ] [ curriculum learning ] [ jax ]


Abstract: Unsupervised environment design (UED) is a form of automatic curriculum learning for training decision-making agents that are robust in zero-shot transfer to unseen environments. Such autocurricula methods have attracted considerable interest from the RL community. However, UED experiments often require up to several weeks of training on standard RL architectures based on CPU rollouts and GPU model updates. This compute requirement is a major obstacle to rapid innovation in UED methods. This work introduces minimax, a library that runs UED training entirely on accelerated hardware. To do so, minimax takes advantage of the JAX library and ports previous UED environments into fully tensorized implementations, allowing the entire training loop to be compiled for hardware acceleration. As a petri dish for rapid experimentation, minimax includes a vectorized version of a grid world based on MiniGrid, in addition to many reusable abstractions for conducting autocurricula in underspecified, procedurally generated environments. With these components, minimax provides strong UED baselines, including new parallelized variants, that achieve speedups of over 50$\times$ in wall-clock time compared to previous implementations.
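
To illustrate what a fully tensorized environment and a compiled rollout loop can look like in JAX, here is a minimal sketch. It is not taken from minimax and does not use its API: the toy grid, the random policy, and the names `GRID_SIZE`, `MOVES`, `step`, `rollout`, and `batched_rollout` are all assumptions made for illustration. It only shows the general pattern of writing the environment step as pure array operations, unrolling time with `jax.lax.scan`, batching environments with `jax.vmap`, and compiling the result with `jax.jit`.

```python
import jax
import jax.numpy as jnp

GRID_SIZE = 5  # hypothetical toy grid, not minimax's maze environment
# Action index -> (d_row, d_col): up, down, left, right.
MOVES = jnp.array([[-1, 0], [1, 0], [0, -1], [0, 1]])


def step(pos, action):
    """Move one agent by one cell, clipping at the grid boundary."""
    return jnp.clip(pos + MOVES[action], 0, GRID_SIZE - 1)


def rollout(key, start_pos, num_steps):
    """Roll out a uniformly random policy in one environment."""
    def body(pos, step_key):
        action = jax.random.randint(step_key, (), 0, 4)
        new_pos = step(pos, action)
        return new_pos, new_pos  # (carry, per-step output)

    step_keys = jax.random.split(key, num_steps)
    final_pos, trajectory = jax.lax.scan(body, start_pos, step_keys)
    return final_pos, trajectory


@jax.jit
def batched_rollout(keys, start_positions):
    # vmap maps the single-environment rollout over a batch of environments;
    # jit compiles the batched loop so it can run on the accelerator.
    return jax.vmap(lambda k, p: rollout(k, p, 16))(keys, start_positions)


num_envs = 8
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)
starts = jnp.zeros((num_envs, 2), dtype=jnp.int32)
final_positions, trajectories = batched_rollout(keys, starts)
print(final_positions.shape, trajectories.shape)
```

Because the step function contains no Python-side control flow that depends on array values, the same code runs unchanged for one environment or thousands; only the leading batch dimension of `keys` and `starts` changes.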
