Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Gradient-free variational learning with conditional mixture networks

Aug 29, 2024

Conor Heins, Hao Wu, Dimitrije Markovic, Alexander Tschantz, Jeff Beck, Christopher Buckley

Figure 1 for Gradient-free variational learning with conditional mixture networks

Figure 2 for Gradient-free variational learning with conditional mixture networks

Figure 3 for Gradient-free variational learning with conditional mixture networks

Figure 4 for Gradient-free variational learning with conditional mixture networks

Share this with someone who'll enjoy it:

Abstract:Balancing computational efficiency with robust predictive performance is crucial in supervised learning, especially for critical applications. Standard deep learning models, while accurate and scalable, often lack probabilistic features like calibrated predictions and uncertainty quantification. Bayesian methods address these issues but can be computationally expensive as model and data complexity increase. Previous work shows that fast variational methods can reduce the compute requirements of Bayesian methods by eliminating the need for gradient computation or sampling, but are often limited to simple models. We demonstrate that conditional mixture networks (CMNs), a probabilistic variant of the mixture-of-experts (MoE) model, are suitable for fast, gradient-free inference and can solve complex classification tasks. CMNs employ linear experts and a softmax gating network. By exploiting conditional conjugacy and P\'olya-Gamma augmentation, we furnish Gaussian likelihoods for the weights of both the linear experts and the gating network. This enables efficient variational updates using coordinate ascent variational inference (CAVI), avoiding traditional gradient-based optimization. We validate this approach by training two-layer CMNs on standard benchmarks from the UCI repository. Our method, CAVI-CMN, achieves competitive and often superior predictive accuracy compared to maximum likelihood estimation (MLE) with backpropagation, while maintaining competitive runtime and full posterior distributions over all model parameters. Moreover, as input size or the number of experts increases, computation time scales competitively with MLE and other gradient-based solutions like black-box variational inference (BBVI), making CAVI-CMN a promising tool for deep, fast, and gradient-free Bayesian networks.

* 16 pages main text (3 figures), including references. 9 pages supplementary material (5 figures)

View paper on

Share this with someone who'll enjoy it:

Title:Gradient-free variational learning with conditional mixture networks

Paper and Code