Poster in Workshop on Responsibly Building Next Generation of Multimodal Foundation Models
Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe
Ezgi Korkmaz
Keywords: deep reinforcement learning, robustness
Policies trained with deep reinforcement learning are being deployed in many different settings, from automated language assistants to biomedical applications. Yet concerns have been raised regarding the robustness and safety of deep reinforcement learning policies. To address these problems, several works have proposed adversarial training methods for deep reinforcement learning and claimed that adversarial training yields safe and robust policies. In this paper, we demonstrate that adversarially trained deep reinforcement learning is neither safe nor robust. Not only can robust deep reinforcement learning policies be attacked via black-box adversarial perturbations, but our results further demonstrate that standard deep reinforcement learning policies are more robust than adversarially trained ones under natural attacks. Furthermore, this paper highlights that robust deep reinforcement learning policies cannot generalize even at the same level as standard reinforcement learning.
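To make the black-box evaluation concrete, below is a minimal sketch of comparing a policy's episode return on clean versus perturbed observations. This is not the paper's attack: `ToyEnv`, `policy`, and the bounded uniform noise are hypothetical placeholders standing in for a real environment, a trained (adversarially trained or standard) policy, and the black-box perturbations studied in the paper.

```python
import numpy as np

class ToyEnv:
    """Hypothetical stand-in environment (not from the paper):
    a point in R^4 that the agent should drive toward the origin."""
    def __init__(self, horizon=50, seed=0):
        self.horizon = horizon
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t = 0
        self.state = self.rng.uniform(-1.0, 1.0, size=4)
        return self.state.copy()

    def step(self, action):
        # Reward is higher the closer the state stays to the origin.
        self.state += 0.1 * action
        self.t += 1
        reward = -float(np.linalg.norm(self.state))
        done = self.t >= self.horizon
        return self.state.copy(), reward, done

def policy(obs):
    """Placeholder policy: push back against the observed state.
    In practice this would be the trained deep RL policy under test."""
    return -np.clip(obs, -1.0, 1.0)

def episode_return(env, policy, epsilon=0.0, rng=None):
    """Roll out one episode. If epsilon > 0, perturb each observation
    fed to the policy with an l-infinity-bounded perturbation (here:
    uniform noise, a simple black-box stand-in). The perturbation
    only alters the policy's input, not the environment state."""
    rng = rng or np.random.default_rng()
    obs, total, done = env.reset(), 0.0, False
    while not done:
        if epsilon > 0:
            obs = obs + rng.uniform(-epsilon, epsilon, size=obs.shape)
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total

if __name__ == "__main__":
    env = ToyEnv()
    clean = np.mean([episode_return(env, policy, 0.0) for _ in range(20)])
    attacked = np.mean([episode_return(env, policy, 0.5) for _ in range(20)])
    print(f"mean return (clean):       {clean:.2f}")
    print(f"mean return (epsilon=0.5): {attacked:.2f}")
```

Running the same comparison for a standard policy and an adversarially trained one, as the paper does with its own perturbations and environments, is what allows the two to be ranked under identical attack budgets.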