Abstract:In this study, we have shown autonomous long-term prediction with a spintronic physical reservoir. Due to the short-term memory property of the magnetization dynamics, non-linearity arises in the reservoir states which could be used for long-term prediction tasks using simple linear regression for online training. During the prediction stage, the output is directly fed to the input of the reservoir for autonomous prediction. We employ our proposed reservoir for the modeling of the chaotic time series such as Mackey-Glass and dynamic time-series data, such as household building energy loads. Since only the last layer of a RC needs to be trained with linear regression, it is well suited for learning in real time on edge devices. Here we show that a skyrmion based magnetic tunnel junction can potentially be used as a prototypical RC but any nanomagnetic magnetic tunnel junction with nonlinear magnetization behavior can implement such a RC. By comparing our spintronic physical RC approach with state-of-the-art energy load forecasting algorithms, such as LSTMs and RNNs, we conclude that the proposed framework presents good performance in achieving high predictions accuracy, while also requiring low memory and energy both of which are at a premium in hardware resource and power constrained edge applications. Further, the proposed approach is shown to require very small training datasets and at the same time being at least 16X energy efficient compared to the state-of-the-art sequence to sequence LSTM for accurate household load predictions.
Abstract:Recent year has brought considerable advancements in Electric Vehicles (EVs) and associated infrastructures/communications. Intrusion Detection Systems (IDS) are widely deployed for anomaly detection in such critical infrastructures. This paper presents an Interpretable Anomaly Detection System (RX-ADS) for intrusion detection in CAN protocol communication in EVs. Contributions include: 1) window based feature extraction method; 2) deep Autoencoder based anomaly detection method; and 3) adversarial machine learning based explanation generation methodology. The presented approach was tested on two benchmark CAN datasets: OTIDS and Car Hacking. The anomaly detection performance of RX-ADS was compared against the state-of-the-art approaches on these datasets: HIDS and GIDS. The RX-ADS approach presented performance comparable to the HIDS approach (OTIDS dataset) and has outperformed HIDS and GIDS approaches (Car Hacking dataset). Further, the proposed approach was able to generate explanations for detected abnormal behaviors arising from various intrusions. These explanations were later validated by information used by domain experts to detect anomalies. Other advantages of RX-ADS include: 1) the method can be trained on unlabeled data; 2) explanations help experts in understanding anomalies and root course analysis, and also help with AI model debugging and diagnostics, ultimately improving user trust in AI systems.
Abstract:Monitoring traffic in computer networks is one of the core approaches for defending critical infrastructure against cyber attacks. Machine Learning (ML) and Deep Neural Networks (DNNs) have been proposed in the past as a tool to identify anomalies in computer networks. Although detecting these anomalies provides an indication of an attack, just detecting an anomaly is not enough information for a user to understand the anomaly. The black-box nature of off-the-shelf ML models prevents extracting important information that is fundamental to isolate the source of the fault/attack and take corrective measures. In this paper, we introduce the Network Transformer (NeT), a DNN model for anomaly detection that incorporates the graph structure of the communication network in order to improve interpretability. The presented approach has the following advantages: 1) enhanced interpretability by incorporating the graph structure of computer networks; 2) provides a hierarchical set of features that enables analysis at different levels of granularity; 3) self-supervised training that does not require labeled data. The presented approach was tested by evaluating the successful detection of anomalies in an Industrial Control System (ICS). The presented approach successfully identified anomalies, the devices affected, and the specific connections causing the anomalies, providing a data-driven hierarchical approach to analyze the behavior of a cyber network.
Abstract:Centuries of development in natural sciences and mathematical modeling provide valuable domain expert knowledge that has yet to be explored for the development of machine learning models. When modeling complex physical systems, both domain knowledge and data contribute important information about the system. In this paper, we present a data-driven model that takes advantage of partial domain knowledge in order to improve generalization and interpretability. The presented model, which we call EVGP (Explicit Variational Gaussian Process), uses an explicit linear prior to incorporate partial domain knowledge while using data to fill in the gaps in knowledge. Variational inference was used to obtain a sparse approximation that scales well to large datasets. The advantages include: 1) using partial domain knowledge to improve inductive bias (assumptions of the model), 2) scalability to large datasets, 3) improved interpretability. We show how the EVGP model can be used to learn system dynamics using basic Newtonian mechanics as prior knowledge. We demonstrate that using simple priors from partially defined physics models considerably improves performance when compared to fully data-driven models.
Abstract:Despite the growing popularity of modern machine learning techniques (e.g. Deep Neural Networks) in cyber-security applications, most of these models are perceived as a black-box for the user. Adversarial machine learning offers an approach to increase our understanding of these models. In this paper we present an approach to generate explanations for incorrect classifications made by data-driven Intrusion Detection Systems (IDSs). An adversarial approach is used to find the minimum modifications (of the input features) required to correctly classify a given set of misclassified samples. The magnitude of such modifications is used to visualize the most relevant features that explain the reason for the misclassification. The presented methodology generated satisfactory explanations that describe the reasoning behind the mis-classifications, with descriptions that match expert knowledge. The advantages of the presented methodology are: 1) applicable to any classifier with defined gradients. 2) does not require any modification of the classifier model. 3) can be extended to perform further diagnosis (e.g. vulnerability assessment) and gain further understanding of the system. Experimental evaluation was conducted on the NSL-KDD99 benchmark dataset using Linear and Multilayer perceptron classifiers. The results are shown using intuitive visualizations in order to improve the interpretability of the results.
Abstract:Ensuring sustainability demands more efficient energy management with minimized energy wastage. Therefore, the power grid of the future should provide an unprecedented level of flexibility in energy management. To that end, intelligent decision making requires accurate predictions of future energy demand/load, both at aggregate and individual site level. Thus, energy load forecasting have received increased attention in the recent past, however has proven to be a difficult problem. This paper presents a novel energy load forecasting methodology based on Deep Neural Networks, specifically Long Short Term Memory (LSTM) algorithms. The presented work investigates two variants of the LSTM: 1) standard LSTM and 2) LSTM-based Sequence to Sequence (S2S) architecture. Both methods were implemented on a benchmark data set of electricity consumption data from one residential customer. Both architectures where trained and tested on one hour and one-minute time-step resolution datasets. Experimental results showed that the standard LSTM failed at one-minute resolution data while performing well in one-hour resolution data. It was shown that S2S architecture performed well on both datasets. Further, it was shown that the presented methods produced comparable results with the other deep learning methods for energy forecasting in literature.
Abstract:Trajectory simplification is a problem encountered in areas like Robot programming by demonstration, CAD/CAM, computer vision, and in GPS-based applications like traffic analysis. This problem entails reduction of the points in a given trajectory while keeping the relevant points which preserve important information. The benefits include storage reduction, computational expense, while making data more manageable. Common techniques formulate a minimization problem to be solved, where the solution is found iteratively under some error metric, which causes the algorithms to work in super-linear time. We present an algorithm called FastSTray, which selects the relevant points in the trajectory in linear time by following an open loop heuristic approach. While most current trajectory simplification algorithms are tailored for GPS trajectories, our approach focuses on smooth trajectories for robot programming by demonstration recorded using motion capture systems.Two variations of the algorithm are presented: 1. aims to preserve shape and temporal information; 2. preserves only shape information. Using the points in the simplified trajectory we use cubic splines to interpolate between these points and recreate the original trajectory. The presented algorithm was tested on trajectories recorded from a hand-tracking system. It was able to eliminate about 90% of the points in the original trajectories while maintaining errors between 0.78-2cm which corresponds to 1%-2.4% relative error with respect to the bounding box of the trajectories.