Contributed Talk
in
Workshop: Generalization in Planning (GenPlan '23)
Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Stochastic Settings
Rushang Karia · Pulkit Verma · Gaurav Vipat · Siddharth Srivastava
Keywords: [ Model-Based Reinforcement Learning ] [ Relational Reinforcement Learning ] [ Non-Stationary Model Learning ] [ Stochastic Planning ]
Reinforcement Learning (RL) provides a convenient framework for sequential decision making when closed-form transition dynamics are unavailable and may change frequently. However, the high sample complexity of RL approaches limits their utility in the real world. This paper presents an approach that performs meta-level exploration in the space of models and uses the learned models to compute policies. Our approach interleaves learning and planning, allowing data-efficient, task-focused sample collection in the presence of non-stationarity. We conduct an empirical evaluation on benchmark domains and show that our approach significantly outperforms baselines in terms of sample complexity and easily adapts to changing transition systems across tasks.
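The abstract describes interleaving model learning with planning, directing exploration toward transitions the learned model is still uncertain about. Below is a minimal, self-contained Python sketch of that general idea only; the toy chain MDP, the count-based tabular model, the one-step greedy planner, and the contradiction-triggered reset are all hypothetical stand-ins for illustration, not the paper's actual algorithm or API.

```python
import random
from collections import defaultdict

N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def true_step(state, action, flipped):
    """Toy non-stationary chain: the meaning of the two actions flips
    when `flipped` is True, changing the transition system mid-run."""
    move = 1 if (action == 1) != flipped else -1
    nxt = max(0, min(N_STATES - 1, state + move))
    return nxt, float(nxt == GOAL)

class CountModel:
    """Tabular maximum-likelihood transition model learned from counts."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, s, a):
        dist = self.counts[(s, a)]
        if not dist:
            return None  # epistemically unknown: (s, a) never tried
        return max(dist, key=dist.get)

    def update(self, s, a, s2):
        if self.predict(s, a) not in (None, s2):
            # Observation contradicts the model's prediction: treat it
            # as evidence of non-stationarity and drop the stale counts.
            self.counts[(s, a)].clear()
        self.counts[(s, a)][s2] += 1

def plan(model, s):
    """One-step greedy 'planner': pick the action whose predicted
    successor is closest to the goal under the current model."""
    best_a, best_d = 0, float("inf")
    for a in range(N_ACTIONS):
        s2 = model.predict(s, a)
        if s2 is not None and abs(GOAL - s2) < best_d:
            best_a, best_d = a, abs(GOAL - s2)
    return best_a

model = CountModel()
for episode in range(200):
    flipped = episode >= 100          # dynamics change halfway through
    s = 0
    for _ in range(20):
        # Epistemic exploration: prefer actions whose outcome the model
        # cannot yet predict; otherwise exploit the planner.
        unknown = [a for a in range(N_ACTIONS) if model.predict(s, a) is None]
        a = random.choice(unknown) if unknown else plan(model, s)
        s2, r = true_step(s, a, flipped)
        model.update(s, a, s2)
        s = s2
        if r > 0:                     # goal reached; end the episode
            break
```

The contradiction-triggered reset is a crude stand-in for principled non-stationarity handling: it returns exactly the affected state-action pairs to the epistemic-exploration branch, so sample collection stays focused on the parts of the transition system that actually changed.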