Abstract: Gaussian-Bernoulli restricted Boltzmann machines (GBRBMs) are often used for semi-supervised anomaly detection, where they are trained using only normal data points. In GBRBM-based anomaly detection, normal and anomalous data are classified based on a score that is identical to an energy function of the marginal GBRBM. However, the classification threshold is difficult to set to an appropriate value because this score has no direct interpretation. In this study, we propose a measure that improves the score's interpretability based on its cumulative distribution, and we establish a guideline for setting the threshold using this interpretable measure. The results of numerical experiments show that the guideline is reasonable when the threshold is set using only normal data points. Moreover, because identifying the measure requires evaluating the minimum score value, which is computationally infeasible, we also propose a method for evaluating the minimum score based on simulated annealing, which is widely used for optimization problems. The proposed evaluation method is also validated using numerical experiments.
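To make the idea of reading a threshold off a cumulative distribution concrete, the following is a minimal sketch that uses the empirical cumulative distribution of scores computed on normal data only. It is a stand-in for illustration and not the measure proposed in the abstract, whose exact definition (including its dependence on the minimum score) is not given here. The function names, the example scores, and the convention that a higher score means "more anomalous" are all assumptions.

```python
from bisect import bisect_right
from math import ceil


def empirical_cdf(normal_scores):
    """Return F(s): the fraction of normal-data scores that are <= s."""
    ordered = sorted(normal_scores)
    n = len(ordered)

    def cdf(score):
        return bisect_right(ordered, score) / n

    return cdf


def threshold_for_level(normal_scores, level):
    """Smallest observed score s with F(s) >= level, for 0 < level <= 1."""
    if not 0.0 < level <= 1.0:
        raise ValueError("level must be in (0, 1]")
    ordered = sorted(normal_scores)
    k = ceil(level * len(ordered))  # number of scores that must be <= s
    return ordered[k - 1]


# Hypothetical scores of normal data points (illustrative values only).
scores = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
cdf = empirical_cdf(scores)
threshold = threshold_for_level(scores, 0.5)


def is_anomalous(score):
    """Assumed convention: a score above the threshold is flagged."""
    return score > threshold
```

The point of the sketch is that a level such as 0.5 or 0.99 is interpretable as a fraction of normal data, whereas a raw score value is not.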
Abstract: Although evaluating expectations on the Ising model is essential in various applications, it is frequently infeasible because of intractable multiple summations (or integrations). Spatial Monte Carlo integration (SMCI) is a sampling-based approximation that can provide high-accuracy estimates of such intractable expectations. To evaluate the expectation of a function of variables in a specific region (called the target region), SMCI considers a larger region containing the target region (called the sum region). In SMCI, the multiple summation over the variables in the sum region is executed exactly, and that over the outer region is evaluated by a sampling approximation such as standard Monte Carlo integration. The accuracy of the SMCI estimator is guaranteed to improve monotonically as the size of the sum region increases. However, a haphazard expansion of the sum region could cause a combinatorial explosion. Therefore, it is desirable to improve the accuracy without such region expansion. In this study, a new effective method based on the theory of generalized least squares is proposed that combines multiple SMCI estimators. The validity of the proposed method is demonstrated theoretically and numerically. The results indicate that the proposed method can be effective in the inverse Ising problem (or Boltzmann machine learning).
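The split described above (exact summation inside the sum region, sampling outside it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: spins take values in {-1, +1}, the unnormalized log-probability is assumed to be sum_i h_i x_i + sum_{i<j} J_ij x_i x_j, only the upper triangle of `J` is read, and all function names are hypothetical. It covers the basic SMCI estimator only and not the generalized-least-squares combination.

```python
from itertools import product
from math import exp


def log_weight(x, h, J):
    """Unnormalized log-probability of spins x (assumed convention)."""
    n = len(x)
    field = sum(h[i] * x[i] for i in range(n))
    pair = sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return field + pair


def conditional_expectation(f, sample, sum_region, h, J):
    """E[f(x) | variables outside sum_region fixed to `sample`], summed exactly."""
    num = den = 0.0
    for values in product((-1, 1), repeat=len(sum_region)):
        x = list(sample)
        for i, v in zip(sum_region, values):
            x[i] = v
        # Terms that do not involve the sum region cancel in num / den,
        # so using the full log-weight is simple (if not the cheapest).
        w = exp(log_weight(x, h, J))
        num += w * f(x)
        den += w
    return num / den


def smci_estimate(f, samples, sum_region, h, J):
    """Average the exact conditional expectation over sampled configurations."""
    total = sum(conditional_expectation(f, s, sum_region, h, J) for s in samples)
    return total / len(samples)


def exact_expectation(f, h, J):
    """Brute-force expectation, feasible only for very small models."""
    states = list(product((-1, 1), repeat=len(h)))
    weights = [exp(log_weight(x, h, J)) for x in states]
    return sum(w * f(x) for w, x in zip(weights, states)) / sum(weights)


# Hypothetical 4-spin chain (illustrative parameters only).
h = [0.2, -0.1, 0.3, 0.0]
J = [[0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, -0.4, 0.0],
     [0.0, 0.0, 0.0, 0.3],
     [0.0, 0.0, 0.0, 0.0]]


def first_spin(x):
    return x[0]
```

The cost of `conditional_expectation` grows as 2 to the power of the sum-region size, which is the combinatorial explosion the abstract warns about. In practice `samples` would come from a sampler such as Gibbs sampling; when the sum region covers every variable, the estimate no longer depends on the samples at all.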
Abstract: Evaluating expectations on a pairwise Boltzmann machine (PBM) (or Ising model) is important for various applications, including statistical machine learning. However, the evaluation is in general computationally difficult because it involves intractable multiple summations or integrations; therefore, it requires an approximation. Monte Carlo integration (MCI) is a well-known approximation method; a more effective MCI-like approximation method, called spatial Monte Carlo integration (SMCI), was proposed recently. However, the estimates obtained from SMCI (and MCI) tend to be poor in low-temperature PBMs owing to degradation of the sampling quality. Annealed importance sampling (AIS) is a type of importance sampling based on Markov chain Monte Carlo methods, and it can suppress performance degradation in low-temperature regions by means of importance weights. In this study, a new method that combines AIS and SMCI is proposed to evaluate expectations on PBMs. The proposed method performs efficiently in both high- and low-temperature regions, which is demonstrated theoretically and numerically.
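For readers unfamiliar with AIS, the following is a minimal sketch of plain AIS on a tiny Ising model. It is not the AIS and SMCI combination proposed in the abstract. The sketch assumes spins in {-1, +1}, a negative energy of the form sum_i h_i x_i + sum_{i<j} J_ij x_i x_j with only the upper triangle of `J` used, a linear inverse-temperature schedule from 0 to 1, and Gibbs sweeps as the Markov chain transition. All names and parameter values are hypothetical.

```python
import random
from itertools import product
from math import exp, log


def phi(x, h, J):
    """Negative energy of spins x (assumed convention)."""
    n = len(x)
    return (sum(h[i] * x[i] for i in range(n))
            + sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n)))


def gibbs_sweep(x, beta, h, J, rng):
    """One in-place Gibbs sweep leaving p_beta(x) ~ exp(beta * phi(x)) invariant."""
    n = len(x)
    for i in range(n):
        local = h[i] + sum(J[min(i, j)][max(i, j)] * x[j] for j in range(n) if j != i)
        p_up = 1.0 / (1.0 + exp(-2.0 * beta * local))
        x[i] = 1 if rng.random() < p_up else -1


def ais_log_z(h, J, n_runs=2000, n_temps=20, seed=0):
    """Estimate log Z at beta = 1 by annealing from the uniform distribution."""
    rng = random.Random(seed)
    n = len(h)
    betas = [k / n_temps for k in range(n_temps + 1)]
    log_weights = []
    for _ in range(n_runs):
        x = [rng.choice((-1, 1)) for _ in range(n)]  # exact sample at beta = 0
        log_w = 0.0
        for prev, cur in zip(betas, betas[1:]):
            log_w += (cur - prev) * phi(x, h, J)  # importance-weight update
            gibbs_sweep(x, cur, h, J, rng)        # transition targeting p_cur
        log_weights.append(log_w)
    shift = max(log_weights)  # log-sum-exp shift for numerical stability
    mean_w = sum(exp(lw - shift) for lw in log_weights) / n_runs
    return n * log(2.0) + shift + log(mean_w)  # log Z at beta = 0 is n * log 2


def exact_log_z(h, J):
    """Brute-force log Z, feasible only for very small n."""
    return log(sum(exp(phi(x, h, J)) for x in product((-1, 1), repeat=len(h))))


# Hypothetical 3-spin model (illustrative parameters only).
h = [0.2, -0.1, 0.3]
J = [[0.0, 0.5, 0.0],
     [0.0, 0.0, -0.4],
     [0.0, 0.0, 0.0]]
estimate = ais_log_z(h, J)
reference = exact_log_z(h, J)
```

The importance weights correct for the fact that each annealing run only approximately follows the intermediate distributions. The same weights can be reused in a self-normalized average to estimate expectations at the target temperature.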