Abstract: Probabilistic models based on Restricted Boltzmann Machines (RBMs) involve the evaluation of normalized Boltzmann factors, which in turn requires the evaluation of the partition function Z. The exact evaluation of Z, however, becomes prohibitively expensive as the system size increases. The situation is even worse for the most common learning algorithms for RBMs, where the exact evaluation of the gradient of the log-likelihood of the empirical data distribution would require computing Z at every iteration. The Annealed Importance Sampling (AIS) method provides a tool to stochastically estimate the partition function of the system. So far, the standard use of the AIS algorithm in the Machine Learning context has relied on a large number of Monte Carlo steps. In this work we show that this may not be required if a suitable starting probability distribution is used to initialize the AIS algorithm. We analyze the performance of AIS on both small and large problems, and show that in both cases a good estimate of Z can be obtained at little computational cost.
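To make the procedure concrete, the following is a minimal sketch of AIS for a binary RBM, assuming the standard Salakhutdinov-Murray annealing path from a zero-weight base RBM to the target model. All names (`ais_log_z`, `a_base`, etc.) are hypothetical, and the choice of base visible biases `a_base` is exactly the kind of starting distribution the abstract refers to: setting them from the data base rates rather than to zero is one common option.

```python
# Minimal AIS sketch for a binary RBM (hypothetical names; not the paper's code).
# Path: intermediate RBMs with weights beta*W, hidden biases beta*b, and
# visible biases (1-beta)*a_base + beta*a, following Salakhutdinov & Murray.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_unnorm_prob(v, beta, W, a, b, a_base):
    """log p*_beta(v) with the hidden units summed out analytically."""
    visible_term = v @ ((1.0 - beta) * a_base + beta * a)
    hidden_term = np.logaddexp(0.0, beta * (v @ W + b)).sum(axis=1)  # softplus
    return visible_term + hidden_term

def ais_log_z(W, a, b, a_base, n_runs=100, betas=None, rng=None):
    """Stochastic estimate of log Z for the RBM defined by (W, a, b)."""
    rng = np.random.default_rng(rng)
    if betas is None:
        betas = np.linspace(0.0, 1.0, 1000)  # annealing schedule
    n_vis, n_hid = W.shape
    # Draw the starting states from the tractable base distribution.
    v = (rng.random((n_runs, n_vis)) < sigmoid(a_base)).astype(float)
    log_w = np.zeros(n_runs)
    for beta_prev, beta in zip(betas[:-1], betas[1:]):
        # Accumulate the importance weights between adjacent distributions.
        log_w += log_unnorm_prob(v, beta, W, a, b, a_base)
        log_w -= log_unnorm_prob(v, beta_prev, W, a, b, a_base)
        # One Gibbs transition that leaves p_beta invariant.
        h = (rng.random((n_runs, n_hid)) < sigmoid(beta * (v @ W + b))).astype(float)
        v_mean = sigmoid((1.0 - beta) * a_base + beta * (h @ W.T + a))
        v = (rng.random((n_runs, n_vis)) < v_mean).astype(float)
    # Exact log Z of the zero-weight base RBM (independent units).
    log_z_base = np.logaddexp(0.0, a_base).sum() + n_hid * np.log(2.0)
    # Numerically stable log of the mean importance weight.
    log_mean_w = np.logaddexp.reduce(log_w) - np.log(n_runs)
    return log_z_base + log_mean_w
```

In this sketch, shortening `betas` reduces the Monte Carlo cost, and a well-chosen `a_base` is what keeps the estimate accurate when few steps are used.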
Abstract: Restricted Boltzmann Machines (RBMs) are general unsupervised learning devices for learning generative models of data distributions. RBMs are often trained with the Contrastive Divergence (CD) learning algorithm, an approximation to the gradient of the data log-likelihood. A simple reconstruction error is often used to decide whether the approximation provided by the CD algorithm is good enough, though several authors (Schulz et al., 2010; Fischer & Igel, 2010) have raised doubts about the reliability of this procedure. However, few alternatives to the reconstruction error have been explored in the literature. In this manuscript we investigate simple alternatives to the reconstruction error, aimed at detecting as early as possible the decrease in the log-likelihood that can occur during learning.
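For context, the sketch below shows CD-1 training of a binary RBM with the usual reconstruction-error monitor, plus a free-energy helper that supports one commonly recommended alternative signal (the free-energy gap between training and validation data). The names are hypothetical and the alternative shown is illustrative; it is not necessarily among the criteria studied in the manuscript.

```python
# Minimal CD-1 sketch with the standard reconstruction-error monitor
# (hypothetical names; not the manuscript's code).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def free_energy(v, W, a, b):
    """F(v) = -a^T v - sum_j softplus(b_j + W_j^T v); lower means more probable."""
    return -(v @ a) - np.logaddexp(0.0, v @ W + b).sum(axis=1)

def cd1_epoch(data, W, a, b, lr=0.05, rng=None):
    """One CD-1 update in place; returns the mean squared reconstruction error."""
    rng = np.random.default_rng(rng)
    # Positive phase: hidden activations driven by the data.
    h_mean = sigmoid(data @ W + b)
    h = (rng.random(h_mean.shape) < h_mean).astype(float)
    # Negative phase: one Gibbs step (the CD-1 reconstruction).
    v_recon = sigmoid(h @ W.T + a)
    h_recon = sigmoid(v_recon @ W + b)
    # Approximate log-likelihood gradient from positive/negative statistics.
    n = data.shape[0]
    W += lr * (data.T @ h_mean - v_recon.T @ h_recon) / n
    a += lr * (data - v_recon).mean(axis=0)
    b += lr * (h_mean - h_recon).mean(axis=0)
    # The usual (and possibly misleading) stopping signal.
    return np.mean((data - v_recon) ** 2)
```

One would then track, for example, `free_energy(valid, W, a, b).mean() - free_energy(train, W, a, b).mean()` across epochs: because the unknown log Z cancels in the difference, a drift in this gap can flag trouble that the reconstruction error misses.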