

Keynote Talk in Workshop: 3rd Workshop on New Frontiers in Adversarial Machine Learning (AdvML-Frontiers)

Alina Oprea

Sat 14 Dec 9:10 a.m. PST — 9:40 a.m. PST

Abstract:

Recent advances in reinforcement learning (RL) have demonstrated how to identify optimal strategies in critical applications such as medical robots, self-driving cars, cyber security defense, and safety alignment in large language models. These applications require strong guarantees on the integrity of the RL methods used to train decision-making agents. In this talk, I will present two recent papers addressing backdoor poisoning attacks on reinforcement learning during the training phase. Both attacks use a novel threat model and significantly improve upon prior RL attacks in the literature in terms of attack success at low poisoning rates, while maintaining high episodic return. Additionally, I will discuss the challenges of designing RL algorithms that are resilient to poisoning attacks.
