Poster in Workshop: Algorithmic Fairness through the lens of Metrics and Evaluation
Fairness-Enhancing Data Augmentation Methods for Worst-Group Accuracy
Monica Welfert · Nathan Stromberg · Lalitha Sankar
Keywords: [ Guarantees ] [ Evaluation Methods and Techniques ]
Sat 14 Dec 9 a.m. PST — 5:30 p.m. PST
Ensuring fair predictions across many distinct subpopulations in the training data can be prohibitively expensive for large models. Recently, simple linear last-layer retraining strategies, combined with data augmentation methods such as upweighting and downsampling, have been shown to achieve state-of-the-art performance for worst-group accuracy, which quantifies accuracy on the least prevalent subpopulation. For linear last-layer retraining and the aforementioned augmentations, we present a comparison of the optimal worst-group accuracy when modeling the distribution of the latent representations (the input to the last layer) as Gaussian for each subpopulation. Observing that these augmentation techniques rely heavily on well-labeled subpopulations, we also compare the optimal worst-group accuracy under label noise. We verify our results on both synthetic and large publicly available datasets.
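As a concrete illustration (not taken from the poster), worst-group accuracy is simply the minimum of the per-group accuracies, where each example carries a group (subpopulation) label. A minimal NumPy sketch, with hypothetical toy data:

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, groups):
    """Return the minimum per-group accuracy over all subpopulations."""
    group_accs = []
    for g in np.unique(groups):
        mask = groups == g
        # accuracy restricted to examples from group g
        group_accs.append(np.mean(y_true[mask] == y_pred[mask]))
    return min(group_accs)

# Toy example: group 0 is the majority (4 examples), group 1 the minority (2).
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1])

# Overall accuracy is 5/6, but group 1's accuracy is only 1/2,
# so worst-group accuracy is 0.5 — the quantity the poster targets.
print(worst_group_accuracy(y_true, y_pred, groups))  # → 0.5
```

Augmentations such as downsampling (discarding majority-group examples) or upweighting (reweighting the loss by inverse group frequency) aim to raise this minimum rather than the overall average.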