Poster in Workshop: Workshop on robustness of zero/few-shot learning in foundation models (R0-FoMo)
Selective Prediction For Open-Ended Question Answering in Black-Box Vision-Language Models
Zaid Khan · Yun Fu
When mistakes have serious consequences, reliable use of a model requires knowing when its predictions are trustworthy. One approach is selective prediction, in which a model is allowed to abstain when it is uncertain. Existing methods for selective prediction require access to model internals, retraining, or a large number of model evaluations, and cannot be applied to black-box models available only through an API. This is a barrier to the use of powerful commercial foundation models in risk-sensitive applications. Furthermore, existing work has largely focused on unimodal foundation models. We propose a method to improve selective prediction in a black-box vision-language model by measuring consistency over the neighbors of a visual question. Because direct sampling of the neighborhood is not possible, we use a probing model as a proxy. We describe experiments testing the proposed method on in-distribution, out-of-distribution, and adversarial questions. We find that the consistency of a vision-language model across rephrasings of a visual question can be used to identify and reject high-risk visual questions, even in out-of-distribution and adversarial settings, constituting a step towards the safe use of black-box vision-language models.
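To make the abstract's mechanism concrete, the sketch below illustrates one way consistency over a neighborhood of rephrasings could drive an accept/abstain decision. The functions `rephrase` (the probing model) and `vlm_answer` (the black-box API), along with the agreement threshold, are illustrative assumptions rather than the authors' exact procedure.

```python
from collections import Counter

def rephrase(question: str, n: int) -> list[str]:
    # Hypothetical probing model: return n paraphrases of the visual question.
    # A real implementation would call a text-generation model; this stub
    # simply repeats the question so the example runs end to end.
    return [question] * n

def vlm_answer(image, question: str) -> str:
    # Hypothetical black-box vision-language model reached only through an API:
    # it returns an answer string and exposes no logits or internals.
    raise NotImplementedError("query the black-box VLM here")

def selective_answer(image, question: str, n_neighbors: int = 5, threshold: float = 0.8):
    """Answer a visual question, or abstain if the VLM is inconsistent
    across rephrasings (a proxy for the neighborhood of the question)."""
    neighborhood = [question] + rephrase(question, n_neighbors)
    answers = [vlm_answer(image, q) for q in neighborhood]
    majority_answer, count = Counter(answers).most_common(1)[0]
    consistency = count / len(answers)  # fraction agreeing with the modal answer
    if consistency < threshold:
        return None, consistency        # abstain: treat as a high-risk question
    return majority_answer, consistency
```

Under these assumptions, a question on which the model's answers fluctuate across paraphrases is flagged as high risk and rejected, while a question answered consistently is accepted with the majority answer.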