Abstract: Remaining useful life prediction plays a crucial role in the health management of industrial systems. Given the increasing complexity of such systems, data-driven predictive models have attracted significant research interest. A review of the existing literature shows that many studies either do not fully integrate both spatial and temporal features or employ only a single attention mechanism. Furthermore, the choice of data normalization method is inconsistent across studies, particularly with respect to operating conditions, which may influence predictive performance. To address these gaps, this study presents the Spatio-Temporal Attention Graph Neural Network. Our model combines graph neural networks and temporal convolutional neural networks for spatial and temporal feature extraction, respectively. Cascading these extractors and adding multi-head attention mechanisms for both the spatial and temporal dimensions aims to improve predictive precision and refine model explainability. Comprehensive experiments were conducted on the C-MAPSS dataset to evaluate the impact of unified versus clustering normalization. The findings suggest that our model achieves state-of-the-art results using only unified normalization. Additionally, on datasets with multiple operating conditions, cluster normalization enhances the performance of our proposed model by up to 27%.
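To make the normalization distinction concrete, the sketch below contrasts the two schemes on C-MAPSS-style inputs: unified normalization applies one global z-score over all samples, while cluster normalization first groups samples by their operating settings and standardizes within each group. This is a minimal illustration under our own assumptions (k-means on the operating-setting channels, `n_conditions=6` as in FD002/FD004); function names and the clustering choice are not the paper's exact pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def unified_normalize(sensors):
    """Global z-score over all samples, ignoring operating conditions."""
    mu, sigma = sensors.mean(axis=0), sensors.std(axis=0) + 1e-8
    return (sensors - mu) / sigma

def cluster_normalize(sensors, op_settings, n_conditions=6):
    """Cluster samples by their operating settings (illustrative choice:
    k-means), then z-score the sensor channels within each cluster."""
    labels = KMeans(n_clusters=n_conditions, n_init=10,
                    random_state=0).fit_predict(op_settings)
    normalized = np.empty_like(sensors, dtype=float)
    for c in range(n_conditions):
        mask = labels == c
        mu = sensors[mask].mean(axis=0)
        sigma = sensors[mask].std(axis=0) + 1e-8
        normalized[mask] = (sensors[mask] - mu) / sigma
    return normalized
```

Under a single operating condition the two functions behave almost identically; the gap the abstract reports appears when several operating regimes shift the sensor statistics apart.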
Abstract: Continual learning can enable neural networks to evolve by learning new tasks sequentially in task-changing scenarios. However, two general and related challenges must be overcome before this technique can be applied to real-world applications. First, novelties newly collected from the data stream in an application may contain anomalies that are meaningless for continual learning. Rather than treating them as a new task for updating, such anomalies have to be filtered out to reduce the disturbance that extremely high-entropy data causes to convergence. Second, little research effort has been devoted to the explainability of continual learning, which leads to a lack of transparency and credibility in the updated neural networks. Detailed explanations of the process and results of continual learning can help experts judge and make decisions. We therefore propose the conceptual design of an explainability module with experts in the loop, based on techniques such as dimensionality reduction, visualization, and evaluation strategies. This work aims to overcome the aforementioned challenges by sufficiently explaining and visualizing the identified anomalies and the updated neural network. With the help of this module, experts can make more confident decisions regarding anomaly filtering, dynamic adjustment of hyperparameters, data backup, etc.
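One way the mentioned entropy-based filtering could be realized is sketched below: compute the predictive entropy of each candidate novelty and hand only low-entropy samples to the continual learner, flagging the rest for expert review. This assumes the network outputs a predictive distribution per sample; the threshold and all names here are illustrative, not the module's actual interface.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of a predictive distribution, per sample."""
    probs = np.clip(probs, 1e-12, 1.0)
    return -(probs * np.log(probs)).sum(axis=-1)

def filter_novelties(samples, probs, threshold):
    """Keep candidate novelties whose predictive entropy stays below the
    threshold; return the rest as anomalies for experts to inspect."""
    entropy = predictive_entropy(probs)
    keep = entropy <= threshold
    return samples[keep], samples[~keep]
```

The expert-in-the-loop design then amounts to visualizing the rejected set and letting the expert confirm, adjust the threshold, or back up the data.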
Abstract: Compared with traditional deep learning techniques, continual learning enables deep neural networks to learn continually and adaptively. As the amount of data keeps increasing in applications, deep neural networks have to learn new tasks while overcoming the forgetting of knowledge obtained from old tasks. In this article, two continual learning scenarios will be proposed to describe the potential challenges in this context. In addition, building on our previous work on the CLeaR framework, which is short for Continual Learning for Regression Tasks, the framework will be further developed to enable models to extend themselves and learn from data successively. Research topics include, but are not limited to, developing continual deep learning algorithms, strategies for detecting non-stationarity in data streams, and explainable and visualizable artificial intelligence. Moreover, the framework- and algorithm-related hyperparameters should be updated dynamically in applications. Forecasting experiments will be conducted on power generation and consumption data collected from real-world applications. A series of comprehensive evaluation metrics and visualization tools will help analyze the experimental results. The proposed framework is expected to generalize to other constantly changing scenarios.
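For the non-stationarity detection mentioned above, a simple baseline is a two-sample test between a reference window and the most recent window of the stream. The sketch below uses a Kolmogorov-Smirnov test on a univariate power signal; the test choice, window framing, and example data are our own assumptions, not the framework's prescribed detector.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference, window, alpha=0.05):
    """Two-sample Kolmogorov-Smirnov test between a reference window and
    the most recent window; a small p-value signals a change in the
    underlying distribution that may warrant a model update."""
    _, p_value = ks_2samp(reference, window)
    return p_value < alpha, p_value

# Example: compare a past regime against the current one (synthetic data).
rng = np.random.default_rng(0)
reference = rng.normal(loc=1.0, scale=0.2, size=672)  # stable regime
window = rng.normal(loc=1.4, scale=0.3, size=672)     # shifted regime
drift, p = detect_drift(reference, window)
print(f"drift detected: {drift} (p={p:.3g})")
```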
Abstract: Catastrophic forgetting means that a trained neural network model gradually forgets previously learned tasks when retrained on new tasks. Overcoming forgetting is a major challenge in machine learning. Numerous continual learning algorithms have been very successful on incremental classification tasks, where new labels appear frequently. However, to the best of our knowledge, no existing research addresses the catastrophic forgetting problem in regression tasks. This problem has emerged as one of the primary constraints in some applications, such as renewable energy forecasting. This article clarifies the problem-related definitions and proposes a new methodological framework that can forecast regression targets and update itself through continual learning. The framework consists of forecasting neural networks and buffers that store data newly collected from a data stream in an application. Changes in the probability distribution of the data stream are identified by the framework and learned sequentially. The framework is called CLeaR (Continual Learning for Regression Tasks), and its components can be flexibly customized for a specific application scenario. We design two sets of experiments to evaluate the CLeaR framework with respect to fitting error (training), prediction error (test), and forgetting ratio. The first is based on an artificial time series and explores how hyperparameters affect the CLeaR framework. The second uses data collected from European wind farms to evaluate the performance of the CLeaR framework in a real-world application. The experimental results demonstrate that the CLeaR framework can efficiently accumulate knowledge from the data stream and improve prediction accuracy. The article concludes with further research issues arising from the requirements to extend the framework.
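The buffer-and-update loop described in the abstract could look roughly like the sketch below: each stream sample is scored for novelty (e.g., prediction error against a threshold `tau`), routed into a buffer if it indicates a distribution change, and the model is updated once the buffer fills. This is a minimal sketch under our own assumptions; the class and function names are illustrative and not the paper's API, and the novelty score and update step are left as pluggable callables.

```python
import numpy as np

class Buffer:
    """Fixed-capacity store for newly collected stream samples."""
    def __init__(self, capacity):
        self.capacity, self.x, self.y = capacity, [], []

    def add(self, x, y):
        self.x.append(x)
        self.y.append(y)

    def full(self):
        return len(self.x) >= self.capacity

def stream_loop(model, stream, buffer, novelty_score, tau, update):
    """Route each (x, y) sample: a high novelty score suggests a change in
    the data distribution, so the sample enters the buffer; once the
    buffer is full, the model is updated (e.g., with a continual learning
    step that penalizes forgetting) and the buffer is cleared."""
    for x, y in stream:
        if novelty_score(model, x, y) > tau:
            buffer.add(x, y)
        if buffer.full():
            update(model, np.array(buffer.x), np.array(buffer.y))
            buffer.x.clear()
            buffer.y.clear()
```

Customizing the buffer capacity, the novelty threshold, and the update strategy corresponds to the per-application flexibility the abstract attributes to the framework's components.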