Michael Eichelbeck

PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects

May 22, 2025

Hannah Markgraf, Michael Eichelbeck, Daria Cappey, Selin Demirtürk, Yara Schattschneider, Matthias Althoff

Abstract: Offline reinforcement learning (RL) has gained traction as a powerful paradigm for learning control policies from pre-collected data, eliminating the need for costly or risky online interactions. While many open-source libraries offer robust implementations of offline RL algorithms, they all rely on datasets composed of experience tuples consisting of state, action, next state, and reward. Managing, curating, and distributing such datasets requires suitable infrastructure. Although static datasets exist for established benchmark problems, no standardized or scalable solution supports developing and sharing datasets for novel or user-defined benchmarks. To address this gap, we introduce PyTupli, a Python-based tool to streamline the creation, storage, and dissemination of benchmark environments and their corresponding tuple datasets. PyTupli includes a lightweight client library with defined interfaces for uploading and retrieving benchmarks and data. It supports fine-grained filtering at both the episode and tuple level, allowing researchers to curate high-quality, task-specific datasets. A containerized server component enables production-ready deployment with authentication, access control, and automated certificate provisioning for secure use. By addressing key barriers in dataset infrastructure, PyTupli facilitates more collaborative, reproducible, and scalable offline RL research.
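To make the idea of experience tuples and two-level filtering concrete, here is a minimal sketch in plain Python. It does not use PyTupli and does not reflect its API; the names Transition, filter_dataset, and episode_return are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Transition:
    """One experience tuple: state, action, next state, reward (hypothetical type)."""
    state: float
    action: float
    next_state: float
    reward: float


Episode = List[Transition]


def episode_return(episode: Episode) -> float:
    """Undiscounted sum of rewards over one episode."""
    return sum(t.reward for t in episode)


def filter_dataset(
    episodes: List[Episode],
    episode_ok: Callable[[Episode], bool],
    tuple_ok: Callable[[Transition], bool],
) -> List[Transition]:
    """Keep tuples that pass tuple_ok and belong to episodes that pass episode_ok."""
    kept: List[Transition] = []
    for episode in episodes:
        if episode_ok(episode):  # episode-level filter
            kept.extend(t for t in episode if tuple_ok(t))  # tuple-level filter
    return kept


# Two toy episodes with returns 3.0 and -0.5.
episodes = [
    [Transition(0.0, 1.0, 1.0, 1.0), Transition(1.0, 1.0, 2.0, 2.0)],
    [Transition(0.0, -1.0, -1.0, -1.0), Transition(-1.0, 1.0, 0.0, 0.5)],
]

# Keep only episodes with positive return, then only tuples with reward >= 2.
curated = filter_dataset(
    episodes,
    episode_ok=lambda ep: episode_return(ep) > 0,
    tuple_ok=lambda t: t.reward >= 2.0,
)
```

In this toy example only the second tuple of the first episode survives both filters.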

Predicting building types and functions at transnational scale

Sep 15, 2024

Jonas Fill, Michael Eichelbeck, Michael Ebner

Abstract: Building-specific knowledge such as building type and function information is important for numerous energy applications. However, comprehensive datasets containing this information for individual households are missing in many regions of Europe. For the first time, we investigate whether it is feasible to predict building types and functional classes at a European scale based on only open GIS datasets available across countries. We train a graph neural network (GNN) classifier on a large-scale graph dataset consisting of OpenStreetMap (OSM) buildings across the EU, Norway, Switzerland, and the UK. To efficiently perform training using the large-scale graph, we utilize localized subgraphs. A graph transformer model achieves a high Cohen's kappa coefficient of 0.754 when classifying buildings into 9 classes, and a very high Cohen's kappa coefficient of 0.844 when classifying buildings into the residential and non-residential classes. The experimental results imply three core novel contributions to the literature. Firstly, we show that building classification across multiple countries is possible using a multi-source dataset consisting of information about 2D building shape, land use, degree of urbanization, and countries as input, and OSM tags as ground truth. Secondly, our results indicate that GNN models that consider contextual information about building neighborhoods improve predictive performance compared to models that only consider individual buildings and ignore the neighborhood. Thirdly, we show that training GNNs on localized subgraphs improves performance over standard GNNs for the task of building classification.
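For readers unfamiliar with the metric, Cohen's kappa corrects the observed agreement p_o between predictions and labels for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from scratch on made-up labels; the function name and the toy data are assumptions for illustration and are unrelated to the paper's dataset or results.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where prediction equals label.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Toy binary example (labels invented for illustration).
y_true = ["res", "res", "res", "non", "non", "non", "res", "non"]
y_pred = ["res", "res", "non", "non", "non", "res", "res", "non"]
kappa = cohens_kappa(y_true, y_pred)
```

Here six of eight predictions agree (p_o = 0.75) and both marginals are balanced (p_e = 0.5), so kappa is 0.5: accuracy looks high, yet only half of the agreement beyond chance is achieved.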

Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking

Jun 06, 2024

Roland Stolz, Hanna Krasowski, Jakob Thumm, Michael Eichelbeck, Philipp Gassert, Matthias Althoff

Abstract: Continuous action spaces in reinforcement learning (RL) are commonly defined as interval sets. While intervals usually reflect the action boundaries for tasks well, they can be challenging for learning because the typically large global action space leads to frequent exploration of irrelevant actions. Yet, little task knowledge can be sufficient to identify significantly smaller state-specific sets of relevant actions. Focusing learning on these relevant actions can significantly improve training efficiency and effectiveness. In this paper, we propose to focus learning on the set of relevant actions and introduce three continuous action masking methods for exactly mapping the action space to the state-dependent set of relevant actions. Thus, our methods ensure that only relevant actions are executed, enhancing the predictability of the RL agent and enabling its use in safety-critical applications. We further derive the implications of the proposed methods on the policy gradient. Using Proximal Policy Optimization (PPO), we evaluate our methods on three control tasks, where the relevant action set is computed based on the system dynamics and a relevant state set. Our experiments show that the three action masking methods achieve higher final rewards and converge faster than the baseline without action masking.
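The general idea of mapping a global action interval onto a smaller state-dependent one can be sketched in one dimension. This is a simplified illustration and not one of the three masking methods from the paper: it assumes a hypothetical integrator x' = x + a * dt, a relevant state set [x_min, x_max], and an affine rescaling of the policy output. All function names are made up for this example.

```python
from typing import Tuple


def relevant_interval(
    x: float, dt: float, x_min: float, x_max: float, a_lo: float, a_hi: float
) -> Tuple[float, float]:
    """Actions that keep the next state x + a * dt inside [x_min, x_max].

    Assumes the toy dynamics x' = x + a * dt (an assumption of this sketch).
    """
    lo = max(a_lo, (x_min - x) / dt)
    hi = min(a_hi, (x_max - x) / dt)
    if lo > hi:
        raise ValueError("no relevant action exists for this state")
    return lo, hi


def rescale_action(
    a: float, a_lo: float, a_hi: float, rel_lo: float, rel_hi: float
) -> float:
    """Affinely map a from the global interval [a_lo, a_hi] to [rel_lo, rel_hi]."""
    frac = (a - a_lo) / (a_hi - a_lo)
    return rel_lo + frac * (rel_hi - rel_lo)


# Global action space [-1, 1]; the state is close to the upper bound.
x, dt, x_min, x_max = 0.75, 0.5, -1.0, 1.0
a_lo, a_hi = -1.0, 1.0

rel_lo, rel_hi = relevant_interval(x, dt, x_min, x_max, a_lo, a_hi)
raw_actions = [-1.0, 0.0, 1.0]  # what an unmasked policy might output
masked = [rescale_action(a, a_lo, a_hi, rel_lo, rel_hi) for a in raw_actions]
next_states = [x + a * dt for a in masked]
```

Every policy output now lands in the relevant interval [-1, 0.5], so the next state never leaves [x_min, x_max], and the mapping remains a bijection between the two intervals.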

CommonPower: Supercharging Machine Learning for Smart Grids

Jun 05, 2024

Michael Eichelbeck, Hannah Markgraf, Matthias Althoff

Abstract: The growing complexity of power system management has led to an increased interest in the use of reinforcement learning (RL). However, no tool for comprehensive and realistic benchmarking of RL in smart grids exists. One prerequisite for such a comparison is a safeguarding mechanism, since vanilla RL controllers cannot guarantee the satisfaction of system constraints. Other central requirements include flexible modeling of benchmarking scenarios, credible baselines, and the possibility to investigate the impact of forecast uncertainties. Our Python tool CommonPower is the first modular framework addressing these needs. CommonPower offers a unified interface for single-agent and multi-agent RL training algorithms and includes a built-in model predictive control approach based on a symbolic representation of the system equations. This makes it possible to combine model predictive controllers with RL controllers in the same system. Leveraging the symbolic system model, CommonPower facilitates the study of safeguarding strategies via the flexible formulation of safety layers. Furthermore, equipped with a generic forecasting interface, CommonPower constitutes a versatile tool that significantly augments the exploration of safe RL controllers in smart grids along several dimensions.

* For the corresponding code repository, see https://github.com/TUMcps/commonpower

Formal Verification of Graph Convolutional Networks with Uncertain Node Features and Uncertain Graph Structure

Apr 23, 2024

Tobias Ladner, Michael Eichelbeck, Matthias Althoff

Abstract: Graph neural networks are becoming increasingly popular in the field of machine learning due to their unique ability to process data structured in graphs. They have also been applied in safety-critical environments where perturbations inherently occur. However, these perturbations require us to formally verify neural networks before their deployment in safety-critical environments as neural networks are prone to adversarial attacks. While there exists research on the formal verification of neural networks, there is no work verifying the robustness of generic graph convolutional network architectures with uncertainty in the node features and in the graph structure over multiple message-passing steps. This work addresses this research gap by explicitly preserving the non-convex dependencies of all elements in the underlying computations through reachability analysis with (matrix) polynomial zonotopes. We demonstrate our approach on three popular benchmark datasets.
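As background on set-based reachability, the sketch below propagates an ordinary zonotope Z = {c + sum_i beta_i * g_i, beta_i in [-1, 1]} through a single affine layer W x + b, which maps it exactly to the zonotope with center W c + b and generators W g_i. This is a deliberately reduced illustration: the paper works with (matrix) polynomial zonotopes and graph convolutions, neither of which is implemented here, and all names and numbers are assumptions.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(W: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def affine_map(
    center: Vector, generators: List[Vector], W: Matrix, b: Vector
) -> Tuple[Vector, List[Vector]]:
    """Exact image of a zonotope under x -> W x + b."""
    new_center = [wc + bi for wc, bi in zip(mat_vec(W, center), b)]
    new_generators = [mat_vec(W, g) for g in generators]  # offset b does not affect generators
    return new_center, new_generators


def interval_hull(center: Vector, generators: List[Vector]) -> Tuple[Vector, Vector]:
    """Tightest axis-aligned box enclosing the zonotope."""
    radius = [sum(abs(g[i]) for g in generators) for i in range(len(center))]
    lower = [c - r for c, r in zip(center, radius)]
    upper = [c + r for c, r in zip(center, radius)]
    return lower, upper


# Uncertain input features: x1 in [0.5, 1.5], x2 in [-0.25, 0.25].
center = [1.0, 0.0]
generators = [[0.5, 0.0], [0.0, 0.25]]

# One affine layer (weights chosen arbitrarily for the example).
W = [[1.0, 1.0], [1.0, -1.0]]
b = [0.0, 1.0]

out_center, out_generators = affine_map(center, generators, W, b)
lower, upper = interval_hull(out_center, out_generators)
```

The generators keep track of how the two outputs depend on the same uncertain inputs, which a plain interval representation would lose after the first layer.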

* under review

Contingency-constrained economic dispatch with safe reinforcement learning

May 12, 2022

Michael Eichelbeck, Hannah Markgraf, Matthias Althoff

Abstract: Future power systems will rely heavily on micro grids with a high share of decentralised renewable energy sources and energy storage systems. The high complexity and uncertainty in this context might make conventional power dispatch strategies infeasible. Reinforcement learning (RL) based controllers can address this challenge but cannot themselves provide safety guarantees, which prevents their deployment in practice. To overcome this limitation, we propose a formally validated RL controller for economic dispatch. We extend conventional constraints with a time-dependent constraint encoding the islanding contingency. The contingency constraint is computed using set-based backwards reachability analysis, and actions of the RL agent are verified through a safety layer. Unsafe actions are projected into the safe action space while leveraging constrained zonotope set representations for computational efficiency. The developed approach is demonstrated on a residential use case using real-world measurements.
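The projection step of a safety layer can be illustrated with the simplest possible safe set. The sketch below assumes the safe action set is an axis-aligned box, for which the Euclidean projection reduces to per-dimension clipping; the paper instead uses constrained zonotopes, where the projection requires solving an optimization problem. The function names and numbers are hypothetical.

```python
from typing import List


def is_safe(action: List[float], lower: List[float], upper: List[float]) -> bool:
    """Check whether the action lies inside the box [lower, upper]."""
    return all(lo <= a <= hi for a, lo, hi in zip(action, lower, upper))


def project_onto_box(
    action: List[float], lower: List[float], upper: List[float]
) -> List[float]:
    """Closest point (Euclidean distance) in the box to the proposed action."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]


def safety_layer(
    action: List[float], lower: List[float], upper: List[float]
) -> List[float]:
    """Pass safe actions through unchanged; project unsafe ones."""
    if is_safe(action, lower, upper):
        return action
    return project_onto_box(action, lower, upper)


# Hypothetical safe set for two dispatch set points, both in [0, 1].
lower, upper = [0.0, 0.0], [1.0, 1.0]

unsafe_proposal = [1.2, -0.3]
safe_proposal = [0.4, 0.9]

corrected = safety_layer(unsafe_proposal, lower, upper)
unchanged = safety_layer(safe_proposal, lower, upper)
```

Because safe actions pass through untouched, the layer only intervenes when the agent proposes something outside the verified set.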
