Poster in Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models
Hybrid Inverse Reinforcement Learning
Juntao Ren · Gokul Swamy · Steven Wu · J. Bagnell · Sanjiban Choudhury
Keywords: [ inverse reinforcement learning ] [ imitation learning ]
The inverse reinforcement learning approach to imitation learning is a double-edged sword. On the one hand, it allows the learner to find policies that are robust to compounding errors. On the other hand, it requires the learner to repeatedly solve a computationally expensive reinforcement learning (RL) problem. Much of this computation is often spent exploring parts of the state space the expert never visited and is therefore wasted. In this work, we propose using hybrid reinforcement learning to curtail this unnecessary exploration. More formally, we derive a reduction from inverse RL to hybrid RL that allows us to dramatically reduce interaction during the inner policy search loop while still maintaining a degree of robustness to compounding errors. Empirically, on a suite of continuous control tasks, we find that our approaches are far more sample-efficient than standard inverse RL and several other baselines that require stronger assumptions.
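To make the structure of this reduction concrete, below is a minimal toy sketch of a hybrid inverse RL loop in NumPy. It is an illustration under stated assumptions, not the authors' exact algorithm: the chain MDP, the 50/50 expert/learner mixing ratio, the tabular reward parameterization, and all function names are hypothetical choices made only for this example. The key idea it shows is that the inner policy search trains on a mixture of expert transitions and the learner's own rollouts (hybrid RL), while the outer loop adjusts the reward adversarially as in reduction-based inverse RL.

```python
# Toy sketch of a hybrid inverse RL loop (illustrative only, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 10, 2, 15          # states, actions, horizon of a toy chain MDP

def step(s, a):
    """Chain MDP dynamics: action 1 moves right (toward the goal), action 0 moves left."""
    return min(s + 1, S - 1) if a == 1 else max(s - 1, 0)

def rollout(policy):
    """Collect one trajectory of (state, action) pairs from a stochastic policy."""
    s, traj = 0, []
    for _ in range(H):
        a = rng.choice(A, p=policy[s])
        traj.append((s, a))
        s = step(s, a)
    return traj

# Expert demonstrations: the expert always moves right.
expert_buffer = [sa for _ in range(20) for sa in rollout(np.tile([0.0, 1.0], (S, 1)))]

reward = np.zeros((S, A))            # learned reward, one parameter per (s, a)
policy = np.full((S, A), 1.0 / A)    # learner policy, initialized uniform
Q = np.zeros((S, A))

for outer in range(30):
    # Inner loop: hybrid RL -- fit Q on a 50/50 mixture of expert and on-policy data,
    # so the learner spends little interaction exploring states the expert avoids.
    learner_buffer = [sa for _ in range(5) for sa in rollout(policy)]
    for _ in range(200):
        buf = expert_buffer if rng.random() < 0.5 else learner_buffer
        s, a = buf[rng.integers(len(buf))]
        s_next = step(s, a)
        target = reward[s, a] + 0.95 * Q[s_next].max()
        Q[s, a] += 0.1 * (target - Q[s, a])
    # Mostly greedy policy improvement with a little residual exploration.
    policy = np.full((S, A), 0.05)
    policy[np.arange(S), Q.argmax(axis=1)] = 0.95

    # Outer loop: adversarial reward update -- raise reward on expert state-actions,
    # lower it on the learner's, as in reduction-based inverse RL.
    for s, a in expert_buffer:
        reward[s, a] += 0.01
    for s, a in learner_buffer:
        reward[s, a] -= 0.01

print("Learner prob. of the expert action at the start state:", policy[0, 1])
```

In this sketch, the only change relative to a standard inverse RL inner loop is the buffer-mixing line; everything else (rollouts, Q-updates, the adversarial reward step) is unchanged, which is what makes the hybrid reduction cheap to adopt.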