NeurIPS Feature Responsiveness Scores: Model-Agnostic Explanations for Agency

Poster in Workshop: Attributing Model Behavior at Scale (ATTRIB)

Seung Hyun Cheon · Anneke Wernerfelt · Sorelle Friedler · Berk Ustun

Abstract:

Government regulations now mandate that individuals who are adversely affected by automated systems receive some form of explanation regarding these decisions. In applications where these decisions are based on machine learning models, the standard approach is to explain predictions using post-hoc feature attribution methods. In this work, we show how these methods fall short of fulfilling their intended goals vis-à-vis consumer protection, specifically with respect to improving individuals' chances of achieving desired outcomes. Furthermore, they can induce harm by giving reasons without recourse, that is, by providing explanations to individuals whose predictions are fixed. We propose to address these shortcomings using feature responsiveness scores and develop a versatile approach to construct these scores for any model, which can be swapped in place of existing methods with minimal friction. We run experiments to study the responsiveness of explanations for classification models in consumer finance, a sector with existing and enforced legislation. Our results reveal how common approaches to comply with existing legislation can mislead individuals, underscoring the need for an alternative approach. The responsiveness score consistently returns features that can lead to recourse and flags potential instances of harm.
