Abstract: We present a real-world application that uses a quantum computer. Specifically, we trained a Restricted Boltzmann Machine (RBM) using quantum annealing (QA) to develop an intrusion detection system. RBMs were trained on the ISCX data, a benchmark dataset for cybersecurity. For comparison, RBMs were also trained using contrastive divergence (CD), a classical method. QA was implemented on D-Wave's 2000Q quantum annealer. Our analysis of the ISCX data shows that the dataset is imbalanced, and we present two schemes to balance the training dataset before feeding it to a classifier. The first scheme is based on undersampling of benign instances: the imbalanced training dataset was divided into five sub-datasets that were used for training separately, and a majority vote was taken to obtain the final result. With the majority vote, the classification accuracy increased from 90.24% to 95.68% for CD and from 74.14% to 80.04% for QA. In the second scheme, an RBM was used to generate synthetic data to balance the training dataset. RBMs trained on synthetic data generated by a CD-trained RBM performed comparably to RBMs trained on synthetic data generated by a QA-trained RBM. The balanced training data was then used to evaluate several classifiers. Among those investigated, K-Nearest Neighbor (KNN) and Neural Network (NN) performed best, each reaching an accuracy of 93%. Our results provide a proof of concept that a QA-based RBM can be trained on a binary dataset with 64-bit records. This illustrative example suggests that many practical classification problems could be migrated to QA-based techniques.
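The five-way split and majority vote described above might be sketched as follows. This is a minimal illustration, not the authors' code. The abstract does not say how the five sub-datasets are formed, so the sketch assumes that the majority class is cut into five slices and each slice is paired with all minority-class records. The function names `make_balanced_subsets` and `majority_vote`, the label convention (0 = benign, 1 = attack), and the toy data and predictions are all hypothetical.

```python
import numpy as np

def make_balanced_subsets(X, y, n_subsets=5, seed=0):
    """Split the majority class into n_subsets slices and pair each slice
    with every minority-class record (an assumed construction)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    majority = classes[np.argmax(counts)]
    maj_idx = rng.permutation(np.flatnonzero(y == majority))
    min_idx = np.flatnonzero(y != majority)
    subsets = []
    for chunk in np.array_split(maj_idx, n_subsets):
        idx = np.concatenate([chunk, min_idx])
        subsets.append((X[idx], y[idx]))
    return subsets

def majority_vote(predictions):
    """predictions: (n_models, n_samples) array of 0/1 labels.
    Returns 1 where more than half of the models predict 1, else 0."""
    predictions = np.asarray(predictions)
    votes_for_attack = predictions.sum(axis=0)
    return (2 * votes_for_attack > predictions.shape[0]).astype(int)

# Toy data (hypothetical): 50 benign and 10 attack records, 64 binary features.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(60, 64))
y = np.array([0] * 50 + [1] * 10)
subsets = make_balanced_subsets(X, y)

# Hypothetical predictions of five separately trained models on 4 test records.
preds = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
])
final = majority_vote(preds)
```

An odd number of models avoids ties in the vote; with an even number, a tie-breaking rule would have to be chosen.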
Abstract: The Restricted Boltzmann Machine (RBM) is an energy-based, undirected graphical model. It is commonly used for unsupervised and supervised machine learning. Typically, an RBM is trained using contrastive divergence (CD). However, training with CD is slow and does not estimate the exact gradient of the log-likelihood cost function. In this work, the model expectation in the gradient used for RBM learning has been calculated using a quantum annealer (D-Wave 2000Q), which is much faster than the Markov chain Monte Carlo (MCMC) used in CD. Training and classification results are compared with CD. The classification accuracy results indicate similar performance of both methods. Image reconstruction as well as log-likelihood calculations are used to compare the performance of quantum and classical algorithms for RBM training. It is shown that the samples obtained from a quantum annealer can be used to train an RBM on a 64-bit "bars and stripes" data set with classification performance similar to an RBM trained with CD. Though training based on CD showed better learning performance, training using a quantum annealer eliminates the computationally expensive MCMC steps of CD.
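To make the role of the model expectation concrete, the sketch below estimates the RBM weight gradient as the difference between a data-driven term and a model-driven term, using the standard energy E(v, h) = -b·v - c·h - vᵀWh. It is an illustration under stated assumptions, not the authors' implementation. The names `gibbs_model_samples` and `weight_gradient`, the 16 hidden units, and the random toy batch are hypothetical. Here the model samples come from one Gibbs step (CD-1). In an annealer-based variant, `v_model` and `h_model` would instead be filled with samples returned by the quantum annealer, and no D-Wave API call is shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_model_samples(v0, W, b, c, rng, k=1):
    """Run k steps of block Gibbs sampling starting from the data batch v0
    (the negative phase of CD-k) and return binary visible/hidden samples."""
    v = v0
    for _ in range(k):
        h = (rng.random((v.shape[0], W.shape[1])) < sigmoid(v @ W + c)).astype(float)
        v = (rng.random((v.shape[0], W.shape[0])) < sigmoid(h @ W.T + b)).astype(float)
    h = (rng.random((v.shape[0], W.shape[1])) < sigmoid(v @ W + c)).astype(float)
    return v, h

def weight_gradient(v_data, W, c, v_model, h_model):
    """Estimate <v h>_data - <v h>_model for the weight matrix W."""
    h_data = sigmoid(v_data @ W + c)  # hidden-unit probabilities given the data
    positive = v_data.T @ h_data / v_data.shape[0]
    negative = v_model.T @ h_model / v_model.shape[0]
    return positive - negative

# Toy setup (hypothetical sizes): 64 visible units, 16 hidden units, batch of 8.
rng = np.random.default_rng(0)
n_visible, n_hidden = 64, 16
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_visible)  # visible biases
c = np.zeros(n_hidden)   # hidden biases
v_data = rng.integers(0, 2, size=(8, n_visible)).astype(float)

v_model, h_model = gibbs_model_samples(v_data, W, b, c, rng, k=1)
grad = weight_gradient(v_data, W, c, v_model, h_model)
W_updated = W + 0.1 * grad  # one gradient-ascent step with an assumed learning rate
```

Only the source of `v_model` and `h_model` changes between the two training approaches in this sketch; the positive term and the update rule stay the same.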