Abstract: Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run time or the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data and then optimizes the weights using a novel quasi-Newton method. Our algorithm is simple to implement, does not require the user to specify a low-cost posterior approximation, and is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that the method reduces construction time by orders of magnitude relative to the state-of-the-art black-box method. Moreover, it provides significant improvements in coreset quality over alternatives with comparable construction times, with far less storage cost and user input required.
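The two-stage structure described in the abstract (uniform subsampling, then weight optimization) can be illustrated with a toy sketch. This is not the proposed algorithm: the abstract does not specify the quasi-Newton update, so the refinement step here swaps in plain projected gradient descent on a least-squares surrogate. The unit-variance Gaussian model, the test parameters `thetas`, and all function names are assumptions made for illustration only.

```python
import numpy as np

def uniform_coreset(n, m, rng):
    """Pick m of n data indices uniformly without replacement; weight each by n/m."""
    idx = rng.choice(n, size=m, replace=False)
    weights = np.full(m, n / m)
    return idx, weights

def log_lik(x, thetas):
    """Unit-variance Gaussian log-likelihood up to a constant (assumed toy model).
    Returns an array of shape (len(thetas), len(x))."""
    return -0.5 * (x[None, :] - thetas[:, None]) ** 2

def surrogate_loss(G, f, w):
    """Squared error between weighted coreset and full-data log-likelihoods."""
    r = G @ w - f
    return 0.5 * float(r @ r)

def refine_weights(G, f, w0, steps=500):
    """Projected gradient descent on surrogate_loss with nonnegative weights.
    A stand-in for the quasi-Newton method, which is not reproduced here."""
    step = 1.0 / np.linalg.norm(G, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = w0.copy()
    for _ in range(steps):
        w = np.maximum(w - step * (G.T @ (G @ w - f)), 0.0)
    return w

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=10_000)  # synthetic data
idx, w0 = uniform_coreset(len(x), 50, rng)

# Hypothetical test parameters near the coreset mean, used only to compare
# the weighted coreset log-likelihood with the full-data log-likelihood.
thetas = rng.normal(loc=x[idx].mean(), scale=0.1, size=200)
G = log_lik(x[idx], thetas)         # per-point log-likelihoods on the coreset
f = log_lik(x, thetas).sum(axis=1)  # full-data pass, affordable only in a toy

w = refine_weights(G, f, w0)
loss_before = surrogate_loss(G, f, w0)
loss_after = surrogate_loss(G, f, w)
```

The uniform weights `n / m` make the weighted coreset log-likelihood an unbiased estimate of the full-data log-likelihood, which is why they serve as a natural starting point before any refinement. The surrogate above touches the full data set once to form `f`; it is meant only to show the shape of the subsample-then-reweight pipeline.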