

Poster in Workshop: Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design

Gradient-free variational learning with conditional mixture networks

Conor Heins · Hao Wu · Dimitrije Markovic · Alexander Tschantz · Jeff Beck · Christopher L Buckley

Keywords: [ Mixture-of-Experts ] [ Variational Inference ] [ conjugate-exponential ] [ Mixture Models ] [ gradient-free ] [ Bayesian neural network ] [ Variational Bayes ]


Abstract:

Balancing computational efficiency with robust predictive performance is crucial in supervised learning, especially for safety-critical applications. While deep learning models are accurate and scalable, they often lack calibrated predictions and uncertainty quantification. Bayesian methods address these issues but can be computationally expensive. We introduce CAVI-CMN, a fast, gradient-free variational method for training conditional mixture networks (CMNs), a probabilistic variant of the mixture-of-experts (MoE) model. Using conjugate priors and Pólya-Gamma augmentation, we derive efficient coordinate ascent variational inference (CAVI) updates and apply them to train CMNs on classification tasks from the UCI repository. CAVI-CMN achieves competitive, and often superior, predictive accuracy compared to backpropagation-based maximum likelihood estimation (MLE) while maintaining posterior distributions over model parameters. Moreover, its computation time scales with model complexity as favorably as MLE and other gradient-based solutions such as black-box variational inference (BBVI); in absolute terms, it runs much faster than BBVI and sampling-based inference, at a speed comparable to MLE. This combination of probabilistic robustness and computational efficiency positions CAVI-CMN as a building block for constructing discriminative models that are fast, gradient-free, and Bayesian.
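To make the style of update concrete, the sketch below illustrates the kind of gradient-free coordinate ascent that Pólya-Gamma augmentation enables, in plain NumPy. It is not the authors' implementation: for simplicity it uses a two-expert mixture with a binary sigmoid gate (the paper handles multiple experts and multi-class outputs, also via Pólya-Gamma augmentation), Gaussian-output experts with a known noise variance, and zero-mean isotropic Gaussian priors. All function and variable names are hypothetical.

```python
# Minimal sketch of PG-augmented CAVI for a two-expert conditional mixture.
# Simplifying assumptions (not the paper's full model): binary sigmoid gate,
# Gaussian-output experts with known noise variance sigma2, zero-mean
# isotropic Gaussian priors on all weights.
import numpy as np

def cavi_cmn_sketch(X, y, sigma2=1.0, prior_var=10.0, n_iters=50):
    """X: (N, D) inputs; y: (N,) targets. Returns variational posteriors."""
    N, D = X.shape
    I = np.eye(D)

    # q(w) = N(mu_w, Sig_w) over gate weights;
    # q(beta_k) = N(mu_b[k], Sig_b[k]) over each expert's weights.
    mu_w, Sig_w = np.zeros(D), prior_var * I
    mu_b = [np.zeros(D), np.zeros(D)]
    Sig_b = [prior_var * I, prior_var * I]
    r = np.full(N, 0.5)  # responsibilities r_n = q(z_n = 1)

    for _ in range(n_iters):
        # q(omega_n) = PG(1, c_n): Polya-Gamma auxiliaries for the gate,
        # with c_n^2 = E[(x_n^T w)^2] and E[omega_n] = tanh(c_n/2) / (2 c_n).
        psi2 = (X @ mu_w) ** 2 + np.einsum('nd,de,ne->n', X, Sig_w, X)
        c = np.sqrt(np.maximum(psi2, 1e-16))
        E_omega = np.tanh(c / 2) / (2 * c)

        # q(w): conditionally conjugate Gaussian update for the gate weights.
        kappa = r - 0.5
        Sig_w = np.linalg.inv(I / prior_var + (X.T * E_omega) @ X)
        mu_w = Sig_w @ (X.T @ kappa)

        # q(beta_k): responsibility-weighted Bayesian linear regression.
        for k, rk in enumerate([1.0 - r, r]):
            Sig_b[k] = np.linalg.inv(I / prior_var + (X.T * rk) @ X / sigma2)
            mu_b[k] = Sig_b[k] @ (X.T @ (rk * y)) / sigma2

        # q(z): combine expected expert log-likelihoods with the gate term.
        # The gate's contribution to the log-odds is exactly E[x^T w].
        expected_ll = []
        for k in range(2):
            resid2 = (y - X @ mu_b[k]) ** 2 \
                + np.einsum('nd,de,ne->n', X, Sig_b[k], X)
            expected_ll.append(-0.5 * resid2 / sigma2)
        logit = X @ mu_w + expected_ll[1] - expected_ll[0]
        r = 1.0 / (1.0 + np.exp(-np.clip(logit, -30.0, 30.0)))

    return mu_w, Sig_w, mu_b, Sig_b, r
```

Every update in the loop is closed-form, which is what makes the scheme gradient-free: each iteration costs a few matrix products and D-by-D inversions rather than gradient steps, consistent with the runtime comparisons against MLE and BBVI described in the abstract.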
