Poster
C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
Tianjiao Luo · Tim Pearce · Huayu Chen · Jianfei Chen · Jun Zhu
Generative Adversarial Imitation Learning (GAIL) provides a promising approach to training a generative policy to imitate a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from an adversarial discriminator. However, optimizing GAIL is difficult in practice: the training loss oscillates, which slows convergence. This optimization instability can prevent GAIL from finding a good policy, harming its final performance. In this paper, we study GAIL’s optimization from a control-theoretic perspective. We show that GAIL cannot converge to the desired equilibrium. In response, we analyze the training dynamics of GAIL in function space and design a novel controller that not only pushes GAIL to the desired equilibrium but also achieves asymptotic stability in a simplified “one-step” setting. Going from theory to practice, we propose Controlled-GAIL (C-GAIL), which adds a differentiable regularization term to the GAIL objective to stabilize training. Empirically, the C-GAIL regularizer improves the training of various existing GAIL methods, including the popular GAIL-DAC, by speeding up convergence, reducing the range of oscillation, and matching the expert distribution more closely.
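For intuition only, the sketch below shows one way a differentiable regularizer can be attached to the standard GAIL discriminator loss, in the spirit described above. The specific regularizer form, the `target` equilibrium value, and the `reg_coef` weight are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def discriminator_loss(disc, expert_batch, policy_batch, reg_coef=1.0, target=0.5):
    # Standard GAIL discriminator loss plus an illustrative C-GAIL-style
    # regularizer (hypothetical form) that penalizes the discriminator's
    # outputs for drifting away from a fixed target equilibrium value.
    expert_logits = disc(expert_batch)   # logits on expert state-action pairs
    policy_logits = disc(policy_batch)   # logits on policy state-action pairs

    # Binary cross-entropy terms of the usual GAIL discriminator objective.
    gail_loss = (
        F.binary_cross_entropy_with_logits(expert_logits, torch.ones_like(expert_logits))
        + F.binary_cross_entropy_with_logits(policy_logits, torch.zeros_like(policy_logits))
    )

    # Differentiable regularizer: keep sigmoid outputs near the target value
    # to damp oscillations during training (assumed form, for illustration).
    outputs = torch.sigmoid(torch.cat([expert_logits, policy_logits], dim=0))
    reg = ((outputs - target) ** 2).mean()

    return gail_loss + reg_coef * reg

Because the extra term is differentiable, it can be added to existing GAIL variants (e.g. GAIL-DAC) without changing the policy-optimization loop, which is what makes a drop-in regularizer of this kind attractive.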