Abstract:To solve the optimal power flow (OPF) problem, reinforcement learning (RL) emerges as a promising new approach. However, the RL-OPF literature is strongly divided regarding the exact formulation of the OPF problem as an RL environment. In this work, we collect and implement diverse environment design decisions from the literature regarding training data, observation space, episode definition, and reward function choice. In an experimental analysis, we show the significant impact of these environment design options on RL-OPF training performance. Further, we derive some first recommendations regarding the choice of these design decisions. The created environment framework is fully open-source and can serve as a benchmark for future research in the RL-OPF field.
Abstract:Energy markets can provide incentives for undesired behavior of market participants. Multi-agent Reinforcement learning (MARL) is a promising new approach to determine the expected behavior of energy market participants. However, reinforcement learning requires many interactions with the system to converge, and the power system environment often consists of extensive computations, e.g., optimal power flow (OPF) calculation for market clearing. To tackle this complexity, we provide a model of the energy market to a basic MARL algorithm, in form of a learned OPF approximation and explicit market rules. The learned OPF surrogate model makes an explicit solving of the OPF completely unnecessary. Our experiments demonstrate that the model additionally reduces training time by about one order of magnitude, but at the cost of a slightly worse approximation of the Nash equilibrium. Potential applications of our method are market design, more realistic modeling of market participants, and analysis of manipulative behavior.
Abstract:The communication topology is an essential aspect in designing distributed optimization heuristics. It can influence the exploration and exploitation of the search space and thus the optimization performance in terms of solution quality, convergence speed and collaboration costs, all relevant aspects for applications operating critical infrastructure in energy systems. In this work, we present an approach for adapting the communication topology during runtime, based on the principles of simulated annealing. We compare the approach to common static topologies regarding the performance of an exemplary distributed optimization heuristic. Finally, we investigate the correlations between fitness landscape properties and defined performance metrics.
Abstract:Modern cyber-physical systems (CPS), such as our energy infrastructure, are becoming increasingly complex: An ever-higher share of Artificial Intelligence (AI)-based technologies use the Information and Communication Technology (ICT) facet of energy systems for operation optimization, cost efficiency, and to reach CO2 goals worldwide. At the same time, markets with increased flexibility and ever shorter trade horizons enable the multi-stakeholder situation that is emerging in this setting. These systems still form critical infrastructures that need to perform with highest reliability. However, today's CPS are becoming too complex to be analyzed in the traditional monolithic approach, where each domain, e.g., power grid and ICT as well as the energy market, are considered as separate entities while ignoring dependencies and side-effects. To achieve an overall analysis, we introduce the concept for an application of distributed artificial intelligence as a self-adaptive analysis tool that is able to analyze the dependencies between domains in CPS by attacking them. It eschews pre-configured domain knowledge, instead exploring the CPS domains for emergent risk situations and exploitable loopholes in codices, with a focus on rational market actors that exploit the system while still following the market rules.
Abstract:Principles of modern cyber-physical system (CPS) analysis are based on analytical methods that depend on whether safety or liveness requirements are considered. Complexity is abstracted through different techniques, ranging from stochastic modelling to contracts. However, both distributed heuristics and Artificial Intelligence (AI)-based approaches as well as the user perspective or unpredictable effects, such as accidents or the weather, introduce enough uncertainty to warrant reinforcement-learning-based approaches. This paper compares traditional approaches in the domain of CPS modelling and analysis with the AI researcher perspective to exploring unknown complex systems.