Abstract: Variational inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is optimization over a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tails of the posterior, leading to miscalibration and potential degeneracy. Importance sampling (IS), on the other hand, is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. The quality of IS crucially depends on the choice of the proposal distribution: ideally, the proposal has heavier tails than the target, which is rarely achieved by minimizing the RKL. We therefore propose a novel combination of optimization and sampling techniques for approximate Bayesian inference, constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence. This approach guarantees asymptotic consistency and fast convergence towards both the optimal IS estimator and the optimal variational approximation. We empirically demonstrate on real data that our method is competitive with variational boosting and MCMC.
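As a minimal illustration of the FKL idea (the abstract does not specify the algorithm, so everything below is an assumption): for a Gaussian proposal family, minimizing a self-normalized importance-sampling (SNIS) estimate of the forward KL reduces to weighted moment matching, giving a simple adaptive-IS loop. The 1-D Student-t target and the names `log_target` and `fit_fkl_proposal` are illustrative, not from the paper.

```python
import numpy as np

# Hypothetical heavy-tailed target: an unnormalized Student-t log density.
def log_target(x, df=3.0):
    return -0.5 * (df + 1.0) * np.log1p(x**2 / df)

def fit_fkl_proposal(log_target, n_iter=50, n_samples=2000, seed=0):
    """Draw from the current Gaussian proposal, compute self-normalized
    importance weights, and update (mu, sigma) by weighted moment matching,
    which is the closed-form minimizer of the SNIS estimate of the forward
    KL within a Gaussian family."""
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 5.0  # deliberately wide initial proposal
    for _ in range(n_iter):
        x = rng.normal(mu, sigma, size=n_samples)
        log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)
        log_w = log_target(x) - log_q            # unnormalized log weights
        w = np.exp(log_w - log_w.max())
        w /= w.sum()                             # self-normalize
        mu = np.sum(w * x)
        sigma = np.sqrt(np.sum(w * (x - mu) ** 2) + 1e-12)
    return mu, sigma

mu, sigma = fit_fkl_proposal(log_target)
print(f"fitted proposal: N({mu:.3f}, {sigma:.3f}^2)")
```

Because the FKL is mass-covering, the fitted sigma tends to stay wide enough to dominate the target's tails, which is exactly the property a good IS proposal needs.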
Abstract: We propose a new method that uses deep learning techniques to accelerate the popular alternating direction method of multipliers (ADMM) solution for inverse problems. The ADMM updates consist of a proximity operator, a least-squares regression that involves a large matrix inversion, and an explicit solution for updating the dual variables. Typically, inner loops are required to solve the first two sub-minimization problems, owing to the intractability of the prior and of the matrix inversion. To avoid these inner loops, we propose an inner-loop-free update rule based on two pre-trained deep convolutional architectures. More specifically, we learn a conditional denoising auto-encoder that imposes an implicit, data-dependent prior/regularization on the ground truth in the first sub-minimization problem. This design follows an empirical Bayesian strategy, leading to so-called amortized inference. For the matrix inversion in the second sub-problem, we learn a convolutional neural network to approximate the matrix inversion; that is, the inverse mapping is learned by feeding the input through the learned forward network. Note that training this neural network requires neither ground truth nor measurements, i.e., it is data-independent. Extensive experiments on both synthetic data and real datasets demonstrate the efficiency and accuracy of the proposed method compared with the conventional ADMM solution that relies on inner loops.
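A sketch of how the described update rule might look, assuming the standard ADMM splitting for min f(x) + g(z) subject to x = z; `denoiser` and `inverse_net` are placeholder callables standing in for the two pre-trained networks, and all names are hypothetical rather than the paper's API.

```python
import numpy as np

def admm_inner_loop_free(y, A, denoiser, inverse_net, rho=1.0, n_iter=30):
    """Each sub-minimization is replaced by a single forward pass, so no
    inner iterative solver is needed:
      inverse_net(b): pre-trained CNN approximating (A^T A + rho*I)^{-1} b
                      (the least-squares x-update);
      denoiser(v):    pre-trained denoising auto-encoder standing in for
                      the proximal operator of the implicit prior."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(n_iter):
        # x-update: least-squares step, matrix inversion amortized by a CNN
        x = inverse_net(A.T @ y + rho * (z - u))
        # z-update: proximal step, amortized by the denoising auto-encoder
        z = denoiser(x + u)
        # dual update: explicit, as in standard ADMM
        u = u + x - z
    return x
```

For a sanity check with rho fixed, `inverse_net` can be replaced by the exact solve `lambda b: np.linalg.solve(A.T @ A + rho * np.eye(A.shape[1]), b)`, which is the mapping the learned network is trained to approximate.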
Abstract: The modern scale of data has brought new challenges to Bayesian inference. In particular, conventional MCMC algorithms are computationally very expensive for large data sets. A promising approach to this problem is embarrassingly parallel MCMC (EP-MCMC), which first partitions the data into multiple subsets and runs independent sampling algorithms on each subset. The subset posterior draws are then aggregated via combining rules to obtain the final approximation. Existing EP-MCMC algorithms are limited by approximation accuracy and by the difficulty of resampling. In this article, we propose a new EP-MCMC algorithm, PART, that solves these problems. The new algorithm applies random partition trees to combine the subset posterior draws; the resulting combination is distribution-free, easy to resample from, and able to adapt to multiple scales. We provide theoretical justification and extensive experiments illustrating its empirical performance.
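To make the EP-MCMC pipeline concrete, here is a toy stand-in for the combination step. PART's actual rule learns random partition trees at multiple scales; this sketch uses a fixed histogram grid on 1-D draws purely for brevity, and `combine_subset_draws` is an illustrative name, not the paper's interface.

```python
import numpy as np

def combine_subset_draws(subset_draws, n_bins=50, n_resample=5000, seed=0):
    """Estimate each subset posterior with a histogram on a shared grid,
    multiply the bin densities (the full posterior is proportional to the
    product of the subset posteriors), and resample from the resulting
    piecewise-constant density."""
    rng = np.random.default_rng(seed)
    lo = min(d.min() for d in subset_draws)
    hi = max(d.max() for d in subset_draws)
    edges = np.linspace(lo, hi, n_bins + 1)
    log_prod = np.zeros(n_bins)
    for draws in subset_draws:
        hist, _ = np.histogram(draws, bins=edges, density=True)
        log_prod += np.log(hist + 1e-12)
    p = np.exp(log_prod - log_prod.max())
    p /= p.sum()
    # resample: pick a bin by its product density, then a point within it
    idx = rng.choice(n_bins, size=n_resample, p=p)
    return rng.uniform(edges[idx], edges[idx + 1])
```

The fixed grid is the weak point this toy shares with earlier EP-MCMC combiners; replacing it with a data-adaptive random partition tree is precisely what the abstract credits PART with.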
Abstract: Hierarchical structure is ubiquitous in data across many domains. Many hierarchical clustering methods, frequently used by domain experts, strive to discover this structure. However, most of these methods limit discoverable hierarchies to those with binary branching structure. This limitation, while computationally convenient, is often undesirable. In this paper we explore a Bayesian hierarchical clustering algorithm that can produce trees with arbitrary branching structure at each node; such trees are known as rose trees. We interpret these trees as mixtures over partitions of a data set, and use a computationally efficient, greedy agglomerative algorithm to find the rose trees that have high marginal likelihood given the data. Finally, we perform experiments demonstrating that rose trees are better models of data than the binary trees returned by other hierarchical clustering algorithms.
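A skeleton of the greedy agglomeration might look as follows; `score` stands in for the paper's marginal-likelihood criterion (not reproduced here), and the "collapse" move is what lets a node acquire more than two children. All names are illustrative assumptions.

```python
from itertools import combinations

def children_of(t):
    # A leaf has no children; treat the leaf itself as the unit to flatten.
    return t[0] if t[0] else (t,)

def greedy_rose_tree(leaves, score):
    """Greedy agglomeration: repeatedly merge the pair of trees (and merge
    type) with the best score. Trees are (children, payload) tuples; `score`
    is a placeholder for the marginal-likelihood ratio used in the paper."""
    forest = [((), x) for x in leaves]
    while len(forest) > 1:
        best = None
        for i, j in combinations(range(len(forest)), 2):
            ti, tj = forest[i], forest[j]
            # join: a new internal node with exactly two children (binary);
            # collapse: flatten both child lists under one node, so the
            # branching factor can exceed two -- the rose-tree case
            for children in ((ti, tj), children_of(ti) + children_of(tj)):
                s = score(children)
                if best is None or s > best[0]:
                    best = (s, i, j, (children, None))
        _, i, j, merged = best
        forest = [t for k, t in enumerate(forest) if k not in (i, j)]
        forest.append(merged)
    return forest[0]
```

With a dummy criterion such as `score = lambda children: -len(children)` the loop runs end to end (and degenerates to binary joins); only a Bayesian marginal-likelihood score like the paper's makes the choice between joining and collapsing meaningful.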