Poster in Workshop: Algorithmic Fairness through the lens of Metrics and Evaluation

Adaptive Group Robust Ensemble Knowledge Distillation

Patrik Joslin Kenfack · Ulrich Aïvodji · Samira Ebrahimi Kahou

Keywords: [ Bias Mitigation ] [ General Fairness ]

Presentation: Algorithmic Fairness through the lens of Metrics and Evaluation
Sat 14 Dec 9 a.m. PST — 5:30 p.m. PST

Abstract:

Neural networks can learn spurious correlations present in the data, which often leads to performance disparities for underrepresented subgroups. Studies have demonstrated that this disparity is amplified when knowledge is distilled from a complex teacher model to a relatively "simple" student model. Prior work has shown that ensemble deep learning methods can improve the performance of the worst-case subgroups; however, it is unclear whether this advantage carries over when distilling knowledge from an ensemble of models (teachers) to a single student model, especially when the teacher models are debiased. This study demonstrates that traditional ensemble knowledge distillation can significantly degrade the student model's performance on the worst-case subgroup, even when the teacher models are debiased. To overcome this, we propose an adaptive knowledge ensembling strategy that ensures the student model receives knowledge beneficial to underrepresented subgroups. Leveraging an additional biased model, our method operates in the gradient space used to update the student model and upweights teachers whose gradients point in the opposite direction of the biased model's gradient. Our experiments on several datasets demonstrate the superiority of the proposed ensemble distillation technique and show that it can even outperform classic model ensembles based on majority voting.
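The gradient-based teacher weighting described above lends itself to a short sketch. Below is a minimal, hypothetical PyTorch rendition of one distillation step, assuming soft-label (KL-divergence) distillation and a softmax over negative cosine similarities as the weighting rule; the names (`distill_step`, `kd_loss`, `flat_grad`) and the exact weighting scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of adaptive ensemble distillation: teachers whose
# gradients oppose the biased model's gradient direction get larger weights.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Standard soft-label distillation loss (KL divergence at temperature T)."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

def flat_grad(loss, params):
    """Flatten the gradient of `loss` w.r.t. `params` into a single vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def distill_step(student, teachers, biased_model, x, y, optimizer):
    params = [p for p in student.parameters() if p.requires_grad]
    s_logits = student(x)

    # Gradient of the student's loss against the *biased* model's soft labels;
    # this direction is treated as "following the bias".
    with torch.no_grad():
        b_logits = biased_model(x)
    g_bias = flat_grad(kd_loss(s_logits, b_logits), params)

    # One gradient per teacher; teachers whose gradients point away from the
    # biased direction receive larger weights (softmax over negative cosine
    # similarity is one plausible instantiation of the abstract's rule).
    sims, t_losses = [], []
    for teacher in teachers:
        with torch.no_grad():
            t_logits = teacher(x)
        loss_t = kd_loss(s_logits, t_logits)
        sims.append(F.cosine_similarity(flat_grad(loss_t, params), g_bias, dim=0))
        t_losses.append(loss_t)
    weights = torch.softmax(-torch.stack(sims), dim=0)  # treated as constants

    # Weighted ensemble distillation loss plus the usual supervised term.
    loss = torch.stack(t_losses) @ weights + F.cross_entropy(s_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the weights are recomputed per batch from whole-batch gradients; a per-example weighting or a different similarity measure would be equally consistent with the description in the abstract.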
