NeurIPS Poster Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

Guhao Feng · Han Zhong

West Ballroom A-D #6406

Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, which approximate the model, the optimal policy, and the optimal value function, respectively. This work investigates a potential hierarchy of representation complexity among these paradigms. Using computational complexity measures, including time complexity and circuit complexity, we theoretically identify a potential representation complexity hierarchy within RL: representing the model is the easiest task, followed by the optimal policy, while representing the optimal value function is the hardest. We also reaffirm this hierarchy from the perspective of the expressiveness of Multi-Layer Perceptrons (MLPs), which aligns more closely with practical deep RL and offers a new perspective for the theoretical study of representation complexity in RL. Finally, we conduct deep RL experiments to validate our theoretical findings.
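To make the three objects named in the abstract concrete, here is a minimal sketch on a toy problem. It is not taken from the paper: the five-state chain, the reward, the discount factor, and the names `model`, `optimal_q`, and `optimal_policy` are all assumptions chosen for illustration. It shows only what each paradigm has to represent (a one-step model, an optimal policy, an optimal value function) and says nothing about the paper's complexity results.

```python
import numpy as np

# Hypothetical toy chain MDP (an assumption for illustration, not from the paper).
N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9


def model(s, a):
    """One-step model: action 0 moves left, action 1 moves right.

    Reward is 1.0 whenever the next state is the right end of the chain.
    """
    s_next = max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward


def optimal_q(n_iters=500):
    """Approximate the optimal action-value function by value iteration."""
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_iters):
        new_q = np.zeros_like(q)
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s_next, r = model(s, a)
                # Bellman optimality backup for a deterministic transition.
                new_q[s, a] = r + GAMMA * q[s_next].max()
        q = new_q
    return q


def optimal_policy(q):
    """Greedy policy with respect to the given action values."""
    return q.argmax(axis=1)


q_star = optimal_q()
pi_star = optimal_policy(q_star)
```

In this sketch the model is a direct one-step rule, the policy is read off the value table with a single `argmax`, and the value function is obtained only after repeated backups through the model. That ordering is a property of how this toy example is computed, not evidence for the hierarchy the paper studies.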