Poster in Workshop: Statistical Frontiers in LLMs and Foundation Models
Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe
Ezgi Korkmaz
Keywords: [ reinforcement learning ]
Learning from raw high-dimensional observations became possible with the help of deep neural networks, and with this advance reinforcement learning research is experiencing one of its most productive periods. Policies trained with deep reinforcement learning are being deployed in settings ranging from medical applications to industrial control. Yet concerns have been raised regarding the robustness and safety of deep reinforcement learning policies. To address these problems, several works have proposed adversarial training methods for deep reinforcement learning and claimed that adversarial training achieves safe and robust deep reinforcement learning policies. In this paper, we demonstrate that adversarially trained deep reinforcement learning is neither safe nor robust. Not only can robust deep reinforcement learning policies be attacked via black-box adversarial perturbations, but our results further demonstrate that standard reinforcement learning policies are more robust than adversarially trained ones under natural attacks. Furthermore, this paper highlights that robust deep reinforcement learning policies do not generalize even to the level of standard reinforcement learning policies.
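To make the notion of a black-box evaluation concrete, below is a minimal sketch of measuring a policy's return when its observations are corrupted by bounded perturbations that require no access to the policy's gradients or internals. The Gymnasium-style `env` interface and the `policy` callable are assumptions for illustration, not the paper's actual attack or evaluation protocol.

```python
import numpy as np

def evaluate_under_perturbation(policy, env, epsilon=0.01, episodes=10, seed=0):
    """Estimate the average return of `policy` when every observation is
    perturbed by uniform noise inside an l-infinity ball of radius `epsilon`.

    Black-box: only observations are modified; no policy internals are used.
    Assumes a Gymnasium-style env (reset -> (obs, info), step -> 5-tuple)
    and a policy callable mapping an observation array to an action.
    """
    rng = np.random.default_rng(seed)
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        done, total = False, 0.0
        while not done:
            # Bounded observation perturbation, clipped to a valid pixel range
            # (assuming observations normalized to [0, 1]).
            noisy_obs = np.clip(obs + rng.uniform(-epsilon, epsilon, obs.shape), 0.0, 1.0)
            action = policy(noisy_obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            total += reward
        returns.append(total)
    return float(np.mean(returns))
```

Comparing this perturbed return against the unperturbed return, for both a standard and an adversarially trained policy, is one simple way to probe the kind of robustness gap the abstract describes.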