Poster

Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning

Beyazit Yalcinkaya · Niklas Lauffer · Marcell Vazquez-Chanlatte · Sanjit Seshia

Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Goal-conditioned reinforcement learning is a powerful way to control an AI agent's behavior at runtime. That said, popular goal representations, e.g., target states or natural language, are either limited to Markovian tasks or rely on ambiguous task semantics. We propose representing temporal goals using compositions of deterministic finite automata (cDFAs). cDFAs balance the need for formal temporal semantics with ease of interpretation: if one can understand a flow chart, one can understand a cDFA. On the other hand, cDFAs form a countably infinite concept class with Boolean semantics, and subtle changes to the automata can result in very different agent behavior. To address this, we observe that all paths through a DFA correspond to a series of reach-avoid tasks. Based on this, we propose pre-training graph neural network embeddings on "reach-avoid derived" DFAs. Empirically, we demonstrate that the proposed pre-training method enables zero-shot generalization to various cDFA task classes and accelerated policy specialization.
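
The observation that a path through a DFA reads as a series of reach-avoid tasks can be illustrated with a small sketch. This is not the authors' implementation: the `DFA` class, the helper names, the toy alphabet, the convention that unlisted transitions self-loop, and the reading of a composition as a conjunction (every DFA must accept) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """Toy DFA (hypothetical helper). Unlisted (state, symbol) pairs self-loop."""
    start: str
    accepting: frozenset
    transitions: dict  # maps (state, symbol) -> next state

    def step(self, state, symbol):
        # Assumption: a missing transition leaves the state unchanged.
        return self.transitions.get((state, symbol), state)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.step(state, symbol)
        return state in self.accepting


def cdfa_accepts(dfas, word):
    # Assumption: the composition is a conjunction, so every DFA must accept.
    return all(dfa.accepts(word) for dfa in dfas)


def reach_avoid_steps(dfa, path, alphabet):
    """Read a state path as a list of (reach, avoid) symbol sets.

    For each consecutive pair (src, dst) on the path, `reach` holds the
    symbols that move src to dst, and `avoid` holds the symbols that move
    src to any state other than src or dst.
    """
    steps = []
    for src, dst in zip(path, path[1:]):
        reach = {a for a in alphabet if dfa.step(src, a) == dst}
        avoid = {a for a in alphabet if dfa.step(src, a) not in (src, dst)}
        steps.append((reach, avoid))
    return steps


ALPHABET = {"red", "green", "blue", "lava", "empty"}

# "Reach red, then reach green, never touching lava."
red_then_green = DFA(
    start="q0",
    accepting=frozenset({"acc"}),
    transitions={
        ("q0", "red"): "q1",
        ("q0", "lava"): "fail",
        ("q1", "green"): "acc",
        ("q1", "lava"): "fail",
    },
)

# "Eventually reach blue."
eventually_blue = DFA(
    start="p0",
    accepting=frozenset({"p1"}),
    transitions={("p0", "blue"): "p1"},
)

steps = reach_avoid_steps(red_then_green, ["q0", "q1", "acc"], ALPHABET)
both = cdfa_accepts([red_then_green, eventually_blue], ["blue", "red", "green"])
```

Here the path `q0 -> q1 -> acc` decomposes into two tasks: reach `red` while avoiding `lava`, then reach `green` while avoiding `lava`. Under the conjunction assumption, the composed task is satisfied only by words that both DFAs accept.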
