Invited Talk in Workshop: eXplainable AI approaches for debugging and diagnosis

[IT4] Detecting model reliance on spurious signals is challenging for post hoc explanation approaches

Julius Adebayo


Abstract:

Ascertaining that a deep network does not rely on an unknown spurious signal as the basis for its output, prior to deployment, is crucial in high-stakes settings like healthcare. While many post hoc explanation methods have been shown to be useful for certain end tasks, recent theoretical and empirical evidence suggests that these methods may not be faithful or useful. This leaves practitioners and researchers with little guidance on how to use these methods in their decision processes. In this talk, we will consider three classes of post hoc explanations (feature attribution, concept activation, and training point ranking) and ask whether these approaches can alert a practitioner to a model's reliance on unknown spurious training signals.
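
To make the feature-attribution setting concrete, below is a minimal sketch of gradient-based saliency, one common post hoc attribution method. The toy model, input shapes, and the watermark scenario in the comments are illustrative assumptions for this sketch, not material from the talk.

```python
# Minimal sketch: gradient-based saliency (feature attribution).
# The model and input here are hypothetical stand-ins; in practice the
# classifier would be a trained deep network.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative stand-in for a trained image classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

# One illustrative 3x32x32 input; gradients w.r.t. the input are needed.
x = torch.randn(1, 3, 32, 32, requires_grad=True)

# Saliency: gradient of the predicted class score w.r.t. the input.
logits = model(x)
pred = logits.argmax(dim=1).item()
logits[0, pred].backward()

# Per-pixel attribution map: max absolute gradient across channels.
saliency = x.grad.abs().max(dim=1).values  # shape (1, 32, 32)
print(saliency.shape)
```

A practitioner might inspect such a map for mass concentrated on a known-irrelevant region (say, a watermark) as evidence of spurious reliance; the talk examines whether this kind of check is actually reliable.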