Poster
in
Workshop: 5th Workshop on Meta-Learning
On the Practical Consistency of Meta-Reinforcement Learning Algorithms
Zheng Xiong · Luisa Zintgraf · Jacob Beck · Risto Vuorio · Shimon Whiteson
Consistency, the theoretical property that a meta-learning algorithm can adapt to any task at test time under its default settings (and various assumptions), has frequently been cited as desirable in the literature. An open question is whether and how theoretical consistency translates into practice, compared to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that theoretically consistent algorithms can indeed usually adapt to out-of-distribution (OOD) tasks, while inconsistent ones cannot, although consistent algorithms can still fail in practice for reasons such as poor exploration. We further find that theoretically inconsistent algorithms can be made consistent by continuing to train on the OOD tasks, after which they adapt as well as or better than consistent ones. We conclude that theoretical consistency is indeed a desirable property, albeit not as advantageous in practice as often assumed.