NeurIPS Multimodal Situational Safety

Oral
in
Workshop: Workshop on Responsibly Building Next Generation of Multimodal Foundation Models

Multimodal Situational Safety

Kaiwen Zhou · Chengzhi Liu · Xuandong Zhao · Anderson Compalas · Xin Eric Wang

Keywords: [ Benchmark ] [ Multimodal Large Language Model ] [ Multimodal Situational Safety ]

[ Abstract ] [ Project Page ]

[ OpenReview]

Sat 14 Dec 11:30 a.m. PST — 11:45 a.m. PST

Abstract:

Multimodal Large Language Models (MLLMs) have emerged as powerful multimodal assistants, capable of interacting with humans and their environments using language and actions. However, these advancements also introduce new safety challenges: whether a query from the user has unsafe intent depends on the situation they are in. To address this, we introduce the problem of Multimodal Situational Safety, where the model needs to judge the safety implications of a language query based on the visual context. Based on this problem, we collect a benchmark comprising 1840 language queries, where each query is paired with one safe image context and one unsafe image context. Our evaluation shows that current MLLMs struggle with this nuanced safety problem. Moreover, to diagnose the impact of different abilities of MLLMs on their safety performance, such as explicit safety reasoning, visual understanding, and situation safety reasoning, we create different evaluation setting variants. Given the diagnosis results, we propose a multi-step safety-examination method to mitigate such attacks and offer insights for future enhancement.

Chat is not available.

Oral in Workshop: Workshop on Responsibly Building Next Generation of Multimodal Foundation Models

Multimodal Situational Safety

Kaiwen Zhou · Chengzhi Liu · Xuandong Zhao · Anderson Compalas · Xin Eric Wang

Oral
in
Workshop: Workshop on Responsibly Building Next Generation of Multimodal Foundation Models