Poster in Workshop: Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations
Policy Comparison Under Confounding
Luke Guerdan · Amanda Coston · Steven Wu · Kenneth Holstein
Predictive models are often introduced under the rationale that they improve performance over an existing decision-making policy. However, it is challenging to directly compare an algorithm against a status quo policy due to uncertainty introduced by confounding and selection bias. In this work, we develop a regret estimator that evaluates differences in classification metrics across decision-making policies under confounding. Theoretical and experimental results demonstrate that our regret estimator yields tighter regret bounds than existing auditing frameworks designed to evaluate predictive models under confounding. Further, we show that our regret estimator can be combined with a flexible set of causal identification strategies to yield informative and well-justified policy comparisons. Our experimental results also illustrate how confounding and selection bias contribute to uncertainty in subgroup-level policy comparisons. We hope that our auditing framework will support the operationalization of regulatory frameworks calling for more direct assessments of predictive model efficacy.
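To make the comparison problem concrete, the sketch below illustrates a naive baseline rather than the paper's estimator: when outcomes are observed only for units the status quo policy selected (a selective-labels setting), the accuracy of each policy is only partially identified, and a regret interval can be formed from Manski-style worst-case bounds. All variable names, the simulated data, and the bounding scheme are assumptions made for illustration; such naive bounds are typically wide, which is the kind of looseness a tighter regret estimator aims to improve on.

```python
import numpy as np

# Illustrative sketch (not the paper's estimator): compare a candidate model
# policy against a status-quo policy on a classification metric (accuracy)
# when the outcome is observed only for units the status quo selected.
# Missing outcomes are bounded pessimistically/optimistically, yielding a
# partial-identification interval on the regret rather than a point estimate.

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                                      # hypothetical covariate
y = x + rng.normal(scale=1.0, size=n) > 0                   # true binary outcome
d_status_quo = x + rng.normal(scale=0.5, size=n) > 0.2      # existing policy's decisions
d_model = x > 0.0                                           # candidate model's decisions

observed = d_status_quo                                     # outcomes seen only when selected


def accuracy_bounds(decisions, y, observed):
    """Worst-case lower/upper bounds on accuracy when y is missing for ~observed units."""
    agree_obs = (decisions[observed] == y[observed]).sum()
    n_missing = (~observed).sum()
    lower = agree_obs / len(decisions)                      # assume all missing units are misclassified
    upper = (agree_obs + n_missing) / len(decisions)        # assume all missing units are correct
    return lower, upper


lo_m, hi_m = accuracy_bounds(d_model, y, observed)
lo_s, hi_s = accuracy_bounds(d_status_quo, y, observed)

# Regret of deploying the model instead of the status quo, as an interval.
regret_lower = lo_s - hi_m
regret_upper = hi_s - lo_m
print(f"regret of model vs. status quo lies in [{regret_lower:.3f}, {regret_upper:.3f}]")
```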