Poster

Nonparametric Bayesian Policy Priors for Reinforcement Learning

Finale P Doshi-Velez · David Wingate · Nicholas Roy · Josh Tenenbaum


Abstract:

We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.
