Poster in Workshop: Distribution shifts: connecting methods and applications (DistShift)
Distribution Preserving Bayesian Coresets using Set Constraints
Shovik Guha · Rajiv Khanna · Sanmi Koyejo
Bayesian coresets have attracted growing interest as a theoretically sound, scalable approach to Bayesian inference. In brief, a coreset is a (weighted) subsample of a dataset that approximates the original dataset under some metric. Bayesian coresets specifically aim to approximate the posterior distribution. Unfortunately, existing Bayesian coreset approaches can significantly undersample minority subpopulations, leading to a lack of distributional robustness. As a remedy, this work extends existing Bayesian coresets from enforcing sparsity constraints to enforcing group-wise sparsity constraints. We explore how this approach helps to mitigate distributional vulnerability. We further generalize the group constraints to Bayesian coresets with matroid constraints, which may be of independent interest. We present an optimization analysis of the proposed approach, along with an empirical evaluation on benchmark datasets that supports our claims.
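To make the difference between a plain sparsity constraint and a group-wise one concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it only illustrates the two constraint sets by hard-thresholding a given weight vector. The function names, the toy weights, the group labels, and the per-group budgets are all hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only; names, weights, and budgets are assumptions,
# not taken from the paper.

def project_global_sparse(weights, k):
    """Keep the k largest-magnitude weights overall; zero out the rest."""
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    keep = set(order[:max(k, 0)])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def project_groupwise_sparse(weights, groups, budgets):
    """Keep the budgets[g] largest-magnitude weights inside each group g.

    Points whose group has no entry in `budgets` receive weight zero.
    """
    projected = [0.0] * len(weights)
    for g, k in budgets.items():
        # Indices of the data points that belong to group g.
        members = [i for i, label in enumerate(groups) if label == g]
        members.sort(key=lambda i: -abs(weights[i]))
        for i in members[:max(k, 0)]:
            projected[i] = weights[i]
    return projected


# Toy data: four majority points with large weights, two minority points
# with small weights.
weights = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
groups = ["major", "major", "major", "major", "minor", "minor"]

# Global budget of 3 points: the minority group is dropped entirely.
global_w = project_global_sparse(weights, 3)

# Same total budget, split per group: one minority point is guaranteed.
group_w = project_groupwise_sparse(weights, groups, {"major": 2, "minor": 1})

print(global_w)
print(group_w)
```

With the same total budget of three points, the global rule selects only majority points, while the per-group budgets reserve one slot for the minority group. Per-group budgets of this kind correspond to a partition matroid, which is one instance of the matroid constraints mentioned in the abstract.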