Abstract:Task embeddings in multi-layer perceptrons for multi-task learning and inductive transfer learning in renewable power forecasts have recently been introduced. In many cases, this approach improves the forecast error and reduces the required training data. However, it does not take the seasonal influences in power forecasts within a day into account, i.e., the diurnal cycle. Therefore, we extend this idea to temporal convolutional networks to consider those seasonalities. We propose transforming the embedding space, which contains the latent similarities between tasks, through convolution and providing these results to the network's residual block. Compared to the multi-layer perceptron approach, the proposed architecture significantly improves the forecast error by up to 25 percent for multi-task learning on the EuropeWindFarm and GermanSolarFarm datasets. Based on the same data, we achieve a ten percent improvement for the wind datasets and more than 20 percent in most cases for the solar dataset for inductive transfer learning without catastrophic forgetting. Finally, we are the first to propose zero-shot learning for renewable power forecasts to provide predictions even if no training data is available.
Abstract:There is recent interest in using model hubs, i.e., collections of pre-trained models, in computer vision tasks. To utilize a model hub, we first select a source model and then adapt the model to the target to compensate for differences. While research on model selection and adaptation for computer vision tasks is still limited, this holds even more for the field of renewable power. At the same time, meeting the increasing demand for power forecasts based on weather features from numerical weather predictions is a crucial challenge. We close these gaps by conducting the first thorough experiment on model selection and adaptation for transfer learning in renewable power forecasts, adopting recent results from the field of computer vision on six datasets. We adapt models based on data from different seasons and limit the amount of training data. As an extension of the current state of the art, we utilize a Bayesian linear regression for forecasting the response based on features extracted from a neural network. This approach outperforms the baseline with only seven days of training data. We further show how combining multiple models through ensembles can significantly improve the model selection and adaptation approach. In fact, with more than 30 days of training data, both proposed model combination techniques achieve similar results to those models trained with a full year of training data.
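The Bayesian linear regression on extracted network features mentioned in this abstract can be illustrated with a minimal sketch. It assumes a zero-mean isotropic Gaussian prior with precision `alpha` and a known noise precision `beta`; both values, the function names, and the random stand-in features are illustrative assumptions and not taken from the paper.

```python
import numpy as np


def fit_bayesian_linear_regression(features, targets, alpha=1.0, beta=25.0):
    """Return posterior mean and covariance of the regression weights.

    alpha: precision of the zero-mean Gaussian prior (assumed value).
    beta: precision of the Gaussian observation noise (assumed known).
    """
    n_features = features.shape[1]
    precision = alpha * np.eye(n_features) + beta * features.T @ features
    covariance = np.linalg.inv(precision)
    mean = beta * covariance @ features.T @ targets
    return mean, covariance


def predict(features, mean, covariance, beta=25.0):
    """Return predictive mean and variance for each row of `features`."""
    pred_mean = features @ mean
    # Noise variance plus the uncertainty that stems from the weights.
    pred_var = 1.0 / beta + np.sum((features @ covariance) * features, axis=1)
    return pred_mean, pred_var


# Stand-in for features extracted from a pre-trained network:
# seven days of hourly data (168 samples), four hypothetical features.
rng = np.random.default_rng(0)
extracted = rng.normal(size=(168, 4))
true_weights = np.array([0.5, -0.2, 0.1, 0.3])
power = extracted @ true_weights + rng.normal(scale=0.2, size=168)

post_mean, post_cov = fit_bayesian_linear_regression(extracted, power)
forecast, variance = predict(extracted[:5], post_mean, post_cov)
```

Because only the last linear layer is re-estimated in closed form, such a model can be fitted with very little target data, and the predictive variance gives a rough indication of how uncertain each forecast is.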
Abstract:Photovoltaic and wind power forecasts in power systems with a high share of renewable energy are essential in several applications. These include stable grid operation, profitable power trading, and forward-looking system planning. However, there is a lack of publicly available datasets for research on machine learning based prediction methods. This paper provides an openly accessible time series dataset with realistic synthetic power data. Other publicly and non-publicly available datasets often lack precise geographic coordinates, timestamps, or static power plant information, e.g., to protect business secrets. In contrast, this dataset provides all of these. The dataset comprises 120 photovoltaic and 273 wind power plants at distinct sites all over Germany, covering 500 days at hourly resolution. This large number of available sites allows forecasting experiments to include spatial correlations and to run experiments in transfer and multi-task learning. It includes site-specific, power source-dependent, non-synthetic input features from the ICON-EU weather model. A simulation of virtual power plants with physical models and actual meteorological measurements provides realistic synthetic power measurement time series. These time series correspond to the power output of virtual power plants at the location of the respective weather measurements. Since the synthetic time series are based exclusively on weather measurements, possible errors in the weather forecast are comparable to those in actual power data. In addition to the data description, we evaluate the quality of weather-prediction-based power forecasts by comparing simplified physical models and a machine learning model. This experiment shows that forecast errors on the synthetic power data are comparable to those on real-world historical power measurements.
Abstract:In this article, we present a novel approach to multivariate probabilistic forecasting. Our approach is based on an extension of single-output quantile regression (QR) to multivariate targets, called quantile surfaces (QS). QS uses a simple yet compelling idea of indexing observations of a probabilistic forecast through direction and vector length relative to an estimated central tendency. QS efficiently models dependencies in multivariate target variables and represents probability distributions through discrete quantile levels. To this end, we present a novel two-stage process. In the first stage, we perform a deterministic point forecast (i.e., central tendency estimation). Subsequently, we model the prediction uncertainty using QS involving neural networks called quantile surface regression neural networks (QSNN). Additionally, we introduce new methods for efficient and straightforward evaluation of the reliability and sharpness of the issued probabilistic QS predictions. We complement this with a directional extension of the Continuous Ranked Probability Score (CRPS). Finally, we evaluate our novel approach on synthetic data and two currently researched real-world challenges in two different domains: first, probabilistic forecasting for renewable energy power generation, and second, short-term cyclist trajectory forecasting for autonomously driving vehicles. Especially for the latter, our empirical results show that even a simple one-layer QSNN outperforms traditional parametric multivariate forecasting techniques, thus improving the state-of-the-art performance.
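As background for the single-output quantile regression that QS extends, the following minimal sketch shows the standard pinball (quantile) loss. It covers the univariate case only and does not implement the directional indexing of quantile surfaces; the function name and the toy data are illustrative assumptions.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))


# Among constant predictions, the loss is smallest near the tau-quantile.
samples = np.linspace(0.0, 1.0, 1001)
candidates = np.linspace(0.0, 1.0, 101)
losses = [pinball_loss(samples, np.full_like(samples, c), 0.9) for c in candidates]
best_constant = candidates[int(np.argmin(losses))]
```

Training one output per quantile level with this loss yields the discrete quantile representation of a univariate predictive distribution.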
Abstract:Multi-task learning (MTL) provides state-of-the-art results in many applications of computer vision and natural language processing. In contrast to single-task learning (STL), MTL allows for leveraging knowledge between related tasks, improving prediction results on the main task (in contrast to an auxiliary task) or on all tasks. However, there is a limited number of comparative studies on applying MTL architectures to regression and time series problems that take recent advances of MTL into account. An interesting, non-linear problem is the forecast of the expected power generation for renewable power plants. Therefore, this article provides a comparative study of the following recent and important MTL architectures: hard parameter sharing, cross-stitch network, and sluice network (SN). They are compared to a multi-layer perceptron model of similar size in an STL setting. Additionally, we provide a simple yet effective approach to model task-specific information through an embedding layer in a multi-layer perceptron, referred to as task embedding. Further, we introduce a new MTL architecture named emerging relation network (ERN), which can be considered an extension of the sluice network. For a solar power dataset, the task embedding achieves the best mean improvement with 14.9%. The mean improvement of the ERN and the SN on the solar dataset is of similar magnitude with 14.7% and 14.8%. On a wind power dataset, only the ERN achieves a significant improvement of up to 7.7%. Results suggest that the ERN is beneficial when tasks are only loosely related and the prediction problem is more non-linear. In contrast, the proposed task embedding is advantageous when tasks are strongly correlated. Further, the task embedding provides an effective approach with reduced computational effort compared to other MTL architectures.
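The idea of a task embedding can be sketched as a lookup table with one vector per task whose entry is concatenated with the input features before the first layer of a multi-layer perceptron. The sketch below shows only an untrained forward pass; all sizes, names, and the random weights are illustrative assumptions rather than the architecture used in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 tasks (e.g., power plants), 4 weather features.
n_tasks, embedding_dim, n_weather_features, n_hidden = 5, 3, 4, 8

# One embedding vector per task; in practice these are learned jointly
# with the network weights, here they are random placeholders.
task_embeddings = rng.normal(size=(n_tasks, embedding_dim))
w_hidden = rng.normal(size=(embedding_dim + n_weather_features, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=(n_hidden, 1))
b_out = np.zeros(1)


def forward(weather_features, task_ids):
    """Forward pass of a single shared MLP conditioned on the task id."""
    embedded = task_embeddings[task_ids]  # shape: (batch, embedding_dim)
    x = np.concatenate([embedded, weather_features], axis=1)
    hidden = np.maximum(x @ w_hidden + b_hidden, 0.0)  # ReLU
    return (hidden @ w_out + b_out).ravel()


weather = rng.normal(size=(2, n_weather_features))
forecast_plant_0 = forward(weather, np.array([0, 0]))
forecast_plant_1 = forward(weather, np.array([1, 1]))
```

All tasks share the same network weights, so the only task-specific parameters are the embedding vectors. This is one plausible reading of why the approach needs less computational effort than architectures with separate task branches.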
Abstract:This article presents an extension of a recent ensemble method called Coopetitive Soft Gating Ensemble (CSGE) and its application to power forecasting as well as motion primitive forecasting of cyclists. The CSGE has been used successfully in the field of wind power forecasting, outperforming common algorithms in this domain. The principal idea of the CSGE is to weight the models according to their observed performance during training with respect to different aspects. Several extensions to the original CSGE are proposed within this article, making the ensemble even more flexible and powerful. The extended CSGE (XCSGE, as we term it) is used to predict the power generation of both wind and solar farms. Moreover, the XCSGE is applied to forecast the movement state of cyclists in the context of driver assistance systems. Both domains have different requirements, are non-trivial problems, and are used to evaluate various facets of the novel XCSGE. The two problems differ fundamentally in the size of the data sets and the number of features. Power forecasting is based on weather forecasts that are subject to fluctuations in their features. In the movement primitive forecasting of cyclists, time delays contribute to the difficulty of the prediction. The XCSGE improves the prediction performance by up to 11% for wind power forecasting and 30% for solar power forecasting compared to the worst performing model. For the classification of movement primitives of cyclists, the XCSGE reaches an improvement of up to 28%. The evaluation includes a comparison with other state-of-the-art ensemble methods. Using the Nemenyi post-hoc test, we verify that the XCSGE results are significantly better.
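The principle of weighting ensemble members by their observed performance can be illustrated with a simple inverse-error scheme. This is a generic sketch and not the CSGE or XCSGE weighting itself; the exponent `eta`, the constant `eps`, and the function names are assumptions chosen for illustration.

```python
import numpy as np


def inverse_error_weights(errors, eta=2.0, eps=1e-8):
    """Turn per-model training errors into normalized ensemble weights.

    eta controls how strongly good models are preferred;
    eta = 0 yields uniform weights (plain averaging).
    """
    errors = np.asarray(errors, dtype=float)
    raw = 1.0 / (errors ** eta + eps)
    return raw / raw.sum()


def combine(predictions, weights):
    """Weighted average of predictions with shape (n_models, n_samples)."""
    return weights @ np.asarray(predictions, dtype=float)


weights = inverse_error_weights([0.1, 0.2, 0.4])
ensemble_forecast = combine([[1.0, 2.0], [1.2, 2.2], [2.0, 3.0]], weights)
```

A single exponent interpolates between averaging all models (cooperation) and selecting almost exclusively the best one (competition), which is one way to read the term "coopetitive".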
Abstract:In recent years, transfer learning has gained particular interest in the fields of vision and natural language processing. In vision, for example, deep neural networks combined with transfer learning techniques achieve almost perfect classification scores within minutes. Nonetheless, these techniques are not yet widely applied in other domains. Therefore, this article identifies critical challenges and shows potential solutions for power forecasts in the field of renewable energies. It proposes a framework utilizing transfer learning techniques in wind power forecasts with limited or no historical data. On the one hand, this allows evaluating the applicability of transfer learning in the field of renewable energy. On the other hand, by developing automatic procedures, we ensure that the proposed methods provide a framework that applies to domains in organic computing as well.
Abstract:For the integration of renewable energy sources, power grid operators need realistic information about the effects of energy production and consumption to assess grid stability. Recently, research in scenario planning has benefited from utilizing generative adversarial networks (GANs) as generative models for operational scenario planning. In these scenarios, operators examine temporal as well as spatial influences of different energy sources on the grid. The analysis of how renewable energy resources affect the grid enables the operators to evaluate the stability and to identify potential weak points such as a limiting transformer. However, due to their novelty, there are limited studies on how well GANs model the underlying power distribution. This analysis is essential because extreme situations with low or high power generation, in particular, are required to evaluate grid stability. We conduct a comparative study of the Wasserstein distance, the binary cross-entropy loss, and a Gaussian copula as the baseline, applied to two wind and two solar datasets that contain limited data compared to previous studies. Both GANs achieve good results considering the limited amount of data, but the Wasserstein GAN is superior in modeling temporal and spatial relations as well as the power distribution. Besides evaluating the generated power distribution over all farms, it is essential to assess terrain-specific distributions for wind scenarios. These terrain-specific power distributions affect the grid through differences in the magnitude of the generated power. Therefore, in a second study, we show that even when simultaneously learning distributions from wind parks with terrain-specific patterns, GANs are capable of modeling these individualities even when faced with limited data.
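To make the Wasserstein distance more concrete, the following minimal sketch computes the empirical 1-Wasserstein distance between two equally sized one-dimensional samples, for example measured and generated normalized power values. It only illustrates the distance as a way to compare distributions; a Wasserstein GAN approximates this quantity with a critic network during training. The function name and the toy data are illustrative assumptions.

```python
import numpy as np


def wasserstein_1d(samples_a, samples_b):
    """Empirical 1-Wasserstein distance for equally sized 1-D samples."""
    a = np.sort(np.asarray(samples_a, dtype=float))
    b = np.sort(np.asarray(samples_b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equally sized samples")
    # For equal sample sizes, the optimal transport plan pairs sorted values.
    return np.mean(np.abs(a - b))


rng = np.random.default_rng(0)
measured = rng.beta(2.0, 5.0, size=1000)   # stand-in for real power values
generated = rng.beta(2.0, 5.0, size=1000)  # stand-in for generated power values
distance = wasserstein_1d(measured, generated)
```

Unlike a comparison of means, this distance also reacts to mismatches in the tails, which matters when extreme low or high generation is of interest.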
Abstract:Despite the increasing importance of forecasts of renewable energy, current planning studies only address a general estimate of the forecast quality to be expected and selected forecast horizons. However, these estimates allow only a limited and highly uncertain use in the planning of electric power distribution. More reliable planning processes require considerably more information about future forecast quality. In this article, we present an in-depth analysis and comparison of influencing factors regarding uncertainty in wind and photovoltaic power forecasts, based on four different machine learning (ML) models. In our analysis, we found substantial differences in uncertainty depending on ML models, data coverage, and seasonal patterns that have to be considered in future planning studies.
Abstract:In this article, we propose the Coopetitive Soft Gating Ensemble (CSGE) for general machine learning tasks and interwoven systems. The goal of machine learning is to create models that generalize well to unknown datasets. Often, however, the problems are too complex to be solved with a single model, so several models are combined. Similarly, Autonomic Computing requires the integration of different systems. Here, especially the local and temporal online evaluation and the resulting (re-)weighting scheme of the CSGE make the approach highly applicable for self-improving system integrations. To achieve the best potential performance, the CSGE can be optimized according to arbitrary loss functions, making it accessible for a broader range of problems. We introduce a novel training procedure including a hyper-parameter initialisation at its heart. We show that the CSGE approach reaches state-of-the-art performance for both classification and regression tasks. Furthermore, the CSGE provides a human-readable quantification of the influence of all base estimators by employing the three weighting aspects. Moreover, we provide a scikit-learn compatible implementation.