

Poster

Aggregating Optimistic Planning Trees for Solving Markov Decision Processes

Gunnar Kedenburg · Raphael Fonteneau · Remi Munos

Harrah's Special Events Center, 2nd Floor

Abstract:

This paper addresses the problem of online planning in Markov Decision Processes using only a generative model. We propose a new algorithm based on the construction of a forest of single-successor-state planning trees. For every explored state-action pair, such a tree contains exactly one successor state, drawn from the generative model. The trees are built using a planning algorithm that follows the principle of optimism in the face of uncertainty, assuming the most favorable outcome in the absence of further information. In the decision-making step of the algorithm, the individual trees are combined. We discuss the approach, prove that our proposed algorithm is consistent, and empirically show that it performs better than a related algorithm which additionally assumes knowledge of all transition distributions.
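
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes rewards in [0, 1], a discount factor `gamma`, a fixed expansion budget per tree, and a simple aggregation rule that averages each root action's lower bound across trees; the aggregation rule used in the paper may differ. The names `plan_single_tree`, `choose_action`, and `toy_model` are hypothetical.

```python
import heapq
import random


def plan_single_tree(root, actions, model, gamma, budget, rng):
    """Optimistically explore one tree with one sampled successor per state-action.

    Returns, for each root action, the best discounted reward sum (a lower
    bound on the return) found in that action's subtree.
    """
    # Heap entries: (-upper_bound, tie_breaker, state, depth, lower_bound, first_action)
    heap = [(-1.0 / (1.0 - gamma), 0, root, 0, 0.0, None)]
    best = {a: 0.0 for a in actions}
    tie = 1
    for _ in range(budget):
        # Optimism: expand the leaf whose upper bound is largest.
        _, _, state, depth, lower, first = heapq.heappop(heap)
        for a in actions:
            # Draw exactly one successor from the generative model.
            next_state, reward = model(state, a, rng)
            child_lower = lower + gamma ** depth * reward
            # Assume every future reward equals the maximum, 1.
            child_upper = child_lower + gamma ** (depth + 1) / (1.0 - gamma)
            child_first = a if first is None else first
            best[child_first] = max(best[child_first], child_lower)
            heapq.heappush(
                heap, (-child_upper, tie, next_state, depth + 1, child_lower, child_first)
            )
            tie += 1
    return best


def choose_action(root, actions, model, gamma=0.9, n_trees=20, budget=50, seed=0):
    """Build a forest of trees and pick the root action with the best average value.

    Averaging is an assumed aggregation rule, used here for illustration only.
    """
    rng = random.Random(seed)
    totals = {a: 0.0 for a in actions}
    for _ in range(n_trees):
        values = plan_single_tree(root, actions, model, gamma, budget, rng)
        for a in actions:
            totals[a] += values[a]
    return max(actions, key=lambda a: totals[a])


def toy_model(state, action, rng):
    """Hypothetical generative model: returns (next_state, reward in [0, 1])."""
    if state == "dead" or action == "jump":
        return "dead", 0.0  # absorbing state with zero reward
    next_state = "dead" if rng.random() < 0.1 else "start"
    return next_state, rng.uniform(0.5, 1.0)


ACTIONS = ["stay", "jump"]
```

Calling `choose_action("start", ACTIONS, toy_model)` returns `"stay"` in this toy problem, because every path that begins with `"jump"` collects zero reward. Each tree sees only one sampled outcome per state-action pair, so a single tree can be misled by a lucky or unlucky draw; combining several trees is what reduces that sampling noise.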
