NeurIPS Poster Reinforcement Learning Under Latent Dynamics: Toward Statistical and Algorithmic Modularity

Oral Poster

Philip Amortila · Dylan J Foster · Nan Jiang · Akshay Krishnamurthy · Zak Mhammedi

West Ballroom A-D #6410

Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Oral presentation: Oral Session 1C: Optimization and Learning Theory
Wed 11 Dec 10 a.m. PST — 11 a.m. PST

Abstract:

Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations, but the underlying ("latent") dynamics are comparatively simple. However, beyond restrictive settings such as tabular latent dynamics, the fundamental statistical requirements and algorithmic principles for reinforcement learning under latent dynamics are poorly understood. This paper addresses the question of reinforcement learning under general latent dynamics from a statistical and algorithmic perspective. On the statistical side, our main negative result shows that most well-studied settings for reinforcement learning with function approximation become intractable when composed with rich observations; we complement this with a positive result, identifying latent pushforward coverability as a general condition that enables statistical tractability. Algorithmically, we develop provably efficient observable-to-latent reductions (that is, reductions that transform an arbitrary algorithm for the latent MDP into an algorithm that can operate on rich observations) in two settings: one where the agent has access to hindsight observations of the latent dynamics (Lee et al., 2023) and one where the agent can estimate self-predictive latent models (Schwarzer et al., 2020). Together, our results serve as a first step toward a unified statistical and algorithmic theory for reinforcement learning under latent dynamics.
