Workshop
Red Teaming GenAI: What Can We Learn from Adversaries?
Valeriia Cherepanova · Bo Li · Niv Cohen · Yifei Wang · Yisen Wang · Avital Shafran · Nil-Jana Akpinar · James Zou
West Meeting Room 301
Sun 15 Dec, 8:15 a.m. PST
The development and proliferation of modern generative AI models have introduced valuable capabilities, but these models and their applications also introduce risks to human safety. How do we identify risks in new systems before they cause harm during deployment? This workshop focuses on red teaming, an emergent adversarial approach to probing model behaviors, and its application to making modern generative AI safe for humans.