Title: Validation with Large Generative Models: A Need for Human-Centric Approaches
Abstract: Especially in applications such as health, we need to know whether our models will behave as we want them to. For smaller-surface models, including deep generative ones, we have a number of statistical and human-centered techniques for gaining confidence that these models are doing largely reasonable things. However, these techniques, already partial for smaller-surface models, provide even fewer assurances for larger-surface models. In this talk, I will discuss how we must fundamentally rethink our approach to validation for larger-surface models. In particular, much of the validation effort must shift from statistical checks done in advance to human-centered checks of a particular output at task-time. I will discuss why this shift will require new methods and lay out some open questions and directions in this space.