

Poster

Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning

David Yunis · Justin Jung · Falcon Dai · Matthew Walter

Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract: Exploration in sparse-reward reinforcement learning (RL) is difficult due to the need for long, coordinated sequences of actions in order to achieve any reward. Skill learning, from demonstrations or interaction, is a promising approach to address this, but skill extraction and inference are expensive for current methods. We present a novel method to extract skills from demonstrations for use in sparse-reward RL, inspired by the popular Byte-Pair Encoding (BPE) algorithm in natural language processing. With these skills, we show strong performance on a variety of tasks, a 1000$\times$ acceleration in skill extraction, and a 100$\times$ acceleration in policy inference. Given the simplicity of our method, skills extracted from 1\% of the demonstrations in one task can be transferred to a new, loosely related task. We also note that such a method yields a finite set of interpretable behaviors.
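To make the core idea concrete, below is a minimal, illustrative sketch of BPE applied to demonstration action sequences, in the spirit the abstract describes. It is not the authors' implementation: the function names, the assumption that actions are already discretized into integer ids, and the toy data are all hypothetical. BPE repeatedly merges the most frequent adjacent pair of tokens into a new token; here, each merged token expands to a fixed sequence of primitive actions, i.e., a candidate skill.

```python
from collections import Counter

def most_frequent_pair(sequences):
    """Count adjacent token pairs across all demonstration sequences
    and return the most common one (or None if no pairs exist)."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None

def merge_pair(seq, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` in `seq`
    with the single token `new_token`."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(new_token)
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged

def bpe_skills(demos, num_merges):
    """Run BPE over discretized action sequences. Each new token id
    maps to the pair it replaced; expanding recursively yields the
    primitive-action sequence, i.e., the skill."""
    vocab = {}  # new token id -> (left token, right token)
    next_id = max(max(seq) for seq in demos) + 1
    seqs = [list(seq) for seq in demos]
    for _ in range(num_merges):
        pair = most_frequent_pair(seqs)
        if pair is None:
            break
        vocab[next_id] = pair
        seqs = [merge_pair(s, pair, next_id) for s in seqs]
        next_id += 1
    return vocab

# Hypothetical example: demonstrations as sequences of discrete action ids.
demos = [[0, 1, 0, 1, 2], [0, 1, 2, 0, 1]]
print(bpe_skills(demos, num_merges=2))  # e.g. {3: (0, 1), 4: (3, 2)}
```

Because the extracted vocabulary is a finite set of fixed action sequences, each skill can be inspected directly, which is one way to read the abstract's claim of interpretable behaviors.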
