

Keynote Talk in Workshop: 3rd Workshop on New Frontiers in Adversarial Machine Learning (AdvML-Frontiers)

Hoda Heidari

Sat 14 Dec 2:30 p.m. PST — 3 p.m. PST

Abstract:

In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, significant questions remain about what precisely AI red-teaming entails, what role it can play in risk identification and evaluation, and how it should be conducted to ensure valid, reliable, and actionable results. I will provide an overview of our recent work, which analyzes recent cases of red-teaming activities in the AI industry and contrasts them with an extensive survey of the relevant research literature to characterize the scope, structure, and criteria for AI red-teaming practices. I will situate our findings in the broader discussions surrounding the evaluation of GenAI and AI governance, and propose an agenda for future work.
