Poster
Minimizing Hyperspherical Energy for Diverse Deep Ensembling
David Smerkous · Qinxun Bai · Fuxin Li
Particle-based Bayesian deep learning often requires a metric for comparing the similarity of two networks. However, naive similarity metrics lack permutation invariance and are therefore inappropriate for comparing networks. Centered Kernel Alignment (CKA) on feature kernels has been proposed for comparing deep networks, but it has not been used as an optimization objective in Bayesian deep learning. This paper explores the use of CKA in Bayesian deep learning to generate diverse ensembles and hypernetworks that output a network posterior. Noting that CKA projects kernels onto a unit hypersphere, we propose applying minimum hyperspherical energy (MHE) on top of CKA kernels to improve the stability of learning. Experiments on both diverse ensembles and hypernetworks show that our approach significantly outperforms baselines in uncertainty quantification on both synthetic and realistic outlier detection tasks.
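To make the idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a linear feature kernel, a geodesic (arccos) distance between unit-norm kernels, and a Riesz-style energy with exponent `s`. The function names `unit_centered_gram`, `cka`, and `hyperspherical_energy`, and the parameters `s` and `eps`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def unit_centered_gram(features):
    """Center a linear Gram matrix and scale it to unit Frobenius norm.

    features: array of shape (n_samples, n_units) from one ensemble member.
    The result can be viewed as a point on a unit hypersphere.
    """
    n = features.shape[0]
    gram = features @ features.T                      # linear kernel (assumption)
    centering = np.eye(n) - np.ones((n, n)) / n
    centered = centering @ gram @ centering
    return centered / (np.linalg.norm(centered) + 1e-12)


def cka(features_a, features_b):
    """Linear CKA: inner product of two unit-norm centered Gram matrices."""
    return float(np.sum(unit_centered_gram(features_a) * unit_centered_gram(features_b)))


def hyperspherical_energy(feature_list, s=2.0, eps=1e-6):
    """Sum of inverse-power geodesic distances between all pairs of members.

    Lower energy means the members' kernels are more spread out on the sphere.
    The arccos distance and the exponent s are illustrative assumptions.
    """
    kernels = [unit_centered_gram(f).ravel() for f in feature_list]
    energy = 0.0
    for i in range(len(kernels)):
        for j in range(i + 1, len(kernels)):
            cosine = np.clip(kernels[i] @ kernels[j], -1.0, 1.0)
            distance = np.arccos(cosine)              # geodesic distance on the sphere
            energy += 1.0 / (distance + eps) ** s
    return energy
```

In this sketch, permuting the hidden units of a member (the columns of `features`) leaves its Gram matrix unchanged, which is the permutation invariance the abstract refers to. In training, an energy of this kind would be computed on differentiable features and minimized alongside the task loss; that part is omitted here.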