

Keynote Talk in Workshop: 3rd Workshop on New Frontiers in Adversarial Machine Learning (AdvML-Frontiers)

Hoda Heidari

Sat 14 Dec 2:30 p.m. PST — 3 p.m. PST

Abstract:

In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, significant questions remain about what precisely AI red-teaming entails, what role it can play in risk identification and evaluation, and how it should be conducted to ensure valid, reliable, and actionable results. I will provide an overview of our recent work, which analyzes recent cases of red-teaming activities in the AI industry and contrasts them with an extensive survey of the relevant research literature to characterize the scope, structure, and criteria for AI red-teaming practices. I will situate our findings in the broader discussions surrounding the evaluation of GenAI and AI governance, and propose an agenda for future work.
