Abstract: In an offline reinforcement learning setting, the safe policy improvement (SPI) problem aims to improve upon the performance of the behavior policy from which the sample data has been generated. State-of-the-art approaches to SPI require a large number of samples to provide practical probabilistic guarantees on the improved policy's performance. We present a novel approach to the SPI problem that requires less data for such guarantees. Specifically, to prove the correctness of these guarantees, we devise implicit transformations of the data set and the underlying environment model that serve as theoretical foundations for deriving tighter improvement bounds for SPI. Our empirical evaluation on standard benchmarks, using the well-established SPI with baseline bootstrapping (SPIBB) algorithm, shows that our method indeed significantly reduces the sample complexity of SPIBB.
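As a concrete illustration of the baseline-bootstrapping idea behind SPIBB (the threshold `n_wedge`, the array shapes, and the greedy projection below are illustrative assumptions, not the authors' implementation), the following sketch keeps the behavior policy's probability mass on state-action pairs with too few samples and reassigns the remaining mass greedily with respect to the estimated Q-function:

```python
import numpy as np

def spibb_policy(q_hat, pi_b, counts, n_wedge):
    """One SPIBB-style improvement step (illustrative sketch).

    q_hat:   (S, A) action values estimated from the offline data
    pi_b:    (S, A) behavior policy probabilities
    counts:  (S, A) number of samples per state-action pair
    n_wedge: count threshold below which we bootstrap on pi_b
    """
    n_states, _ = q_hat.shape
    pi = np.zeros_like(pi_b)
    for s in range(n_states):
        uncertain = counts[s] < n_wedge           # poorly explored actions
        # keep the behavior policy's mass on uncertain actions
        pi[s, uncertain] = pi_b[s, uncertain]
        free_mass = 1.0 - pi[s, uncertain].sum()  # mass we may reassign
        certain = ~uncertain
        if certain.any():
            # put the remaining mass on the best well-estimated action
            best = np.flatnonzero(certain)[np.argmax(q_hat[s, certain])]
            pi[s, best] += free_mass
        # if every action is uncertain, pi[s] already equals pi_b[s]
    return pi
```

The threshold `n_wedge` governs the trade-off between safety and sample complexity; the tighter improvement bounds described in the abstract aim at retaining the same guarantee while needing fewer samples per state-action pair.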
Abstract: A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision processes (MDPs), the impact of the agents' actions is known in terms of successor states but not the stochastics involved. In this paper, we devise a strategy synthesis algorithm for gray-box MDPs via reinforcement learning that uses interval MDPs as its internal model. To cope with the limited sampling access in reinforcement learning, we incorporate two novel concepts into our algorithm, focusing on rapid and successful learning rather than on stochastic guarantees and optimality: lower confidence bound exploration reinforces variants of already learned practical strategies, and action scoping reduces the learning action space to promising actions. We illustrate the benefits of our algorithm by means of a prototypical implementation applied to examples from the AI and formal methods communities.
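To make the two exploration concepts more concrete, here is a minimal Python sketch under assumed interfaces (the function names, the confidence radius, and the pruning margin are illustrative, not the paper's definitions): lower-confidence-bound selection favors actions whose pessimistic value estimate is already high, and action scoping prunes actions that no longer look promising once enough samples are available.

```python
import numpy as np

def lcb_action(q_est, counts, total_visits, beta=1.0):
    """Pick an action by a lower confidence bound (sketch).

    Choosing the action with the highest pessimistic estimate reinforces
    strategies that have already proven themselves, instead of exploring
    optimistically.
    """
    bonus = beta * np.sqrt(np.log(max(total_visits, 2)) / np.maximum(counts, 1))
    lcb = q_est - bonus
    return int(np.argmax(lcb))

def scope_actions(q_est, counts, margin=0.1, min_samples=10):
    """Reduce the action space to promising actions (sketch).

    Once every action has a minimum number of samples, only actions whose
    estimate lies within `margin` of the best one stay in the scope.
    """
    if counts.min() < min_samples:
        return np.arange(len(q_est))   # not enough data to prune yet
    return np.flatnonzero(q_est >= q_est.max() - margin)
```

In this reading, the scoped action set would be recomputed per state as estimates improve, so that learning concentrates on the remaining promising actions.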
Abstract: Bayesian networks (BNs) are probabilistic graphical models widely used for representing expert knowledge and reasoning under uncertainty. Traditionally, they are based on directed acyclic graphs that capture dependencies between random variables. However, directed cycles can naturally arise when cross-dependencies between random variables exist, e.g., for modeling feedback loops. Existing methods to deal with such cross-dependencies usually rely on reductions to BNs without cycles. These approaches are difficult to generalize, since their justifications are intertwined with additional knowledge about the application context. In this paper, we present a foundational study of semantics for cyclic BNs that are generic and conservatively extend the cycle-free setting. First, we propose constraint-based semantics that specify requirements for full joint distributions over a BN to be consistent with the local conditional probabilities and independencies. Second, we introduce two kinds of limit semantics that formalize infinite unfolding approaches and show that they are computable by a Markov chain construction.
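The Markov chain construction behind the limit semantics can be pictured roughly as follows; the synchronous resampling rule and the power-iteration limit in this sketch are assumptions made for illustration and not necessarily the construction used in the paper. Each step of the chain resamples every (binary) variable from its conditional probability table given the parents' values in the current joint state, and the limit of the iterated distribution, when it exists, plays the role of the infinite unfolding.

```python
import itertools
import numpy as np

def unfolding_chain(parents, cpts, n_vars):
    """Transition matrix of an unfolding Markov chain over joint states (sketch).

    parents: dict  var -> tuple of parent variable indices (cycles allowed)
    cpts:    dict  var -> function(parent_values) -> P(var = 1 | parents)
    One step resamples every binary variable from its CPT, given the
    parents' values in the current joint state.
    """
    states = list(itertools.product([0, 1], repeat=n_vars))
    T = np.zeros((len(states), len(states)))
    for i, s in enumerate(states):
        for j, t in enumerate(states):
            p = 1.0
            for v in range(n_vars):
                pv = cpts[v](tuple(s[k] for k in parents[v]))
                p *= pv if t[v] == 1 else 1.0 - pv
            T[i, j] = p
    return T, states

def limit_distribution(T, init, steps=10_000):
    """Iterate the chain; if the distribution converges, return the limit."""
    dist = init.copy()
    for _ in range(steps):
        nxt = dist @ T
        if np.allclose(nxt, dist, atol=1e-12):
            break
        dist = nxt
    return dist
```

For instance, a two-variable feedback loop X → Y → X yields a chain over the four joint assignments of (X, Y), whose iterated distribution can then be inspected for convergence.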