Abstract: Despite advances in probabilistic model checking, the scalability of verification methods remains limited. In particular, the state space often becomes extremely large when parameterized Markov decision processes (MDPs) are instantiated, even with moderate parameter values. Synthesizing policies for such \emph{huge} MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach through extensive experiments on relevant models from the Quantitative Verification Benchmark Set. The experimental results indicate that our policies perform well even when the model size is orders of magnitude beyond the reach of state-of-the-art analysis tools.
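A minimal sketch of the generalization step described above, assuming two hypothetical helpers that are not from the paper: `solve_small_instance(n)`, which returns a mapping from states to optimal actions for a small instance (e.g., as produced by a probabilistic model checker), and `features(state)`, which encodes a state's variable valuation as a numeric vector.

```python
# Sketch: generalize optimal policies of small MDP instances via a decision tree.
# `solve_small_instance` and `features` are illustrative placeholders.
from sklearn.tree import DecisionTreeClassifier

def learn_generalized_policy(small_params=(2, 3, 4)):
    X, y = [], []
    for n in small_params:                        # model-check small instances
        for state, action in solve_small_instance(n).items():
            X.append(features(state))             # numeric state features
            y.append(action)                      # optimal action as label
    tree = DecisionTreeClassifier(max_depth=10)   # bounded depth keeps it explainable
    tree.fit(X, y)
    return tree

# The learnt tree is then queried on states of a huge instance without ever
# enumerating its state space:
#   action = tree.predict([features(huge_state)])[0]
```

The key design point is that the tree predicts from state features rather than state identities, so it can be evaluated on instances whose state spaces were never explored.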
Abstract: Strategies for partially observable Markov decision processes (POMDPs) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that synthesize an automaton directly from the POMDP, thereby solving it, our approach is far more scalable.
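One way to picture the setup is to treat a given tabular strategy as a black box answering L* membership queries. The sketch below does this with the AALpy automata-learning library (https://github.com/DES-Lab/AALpy); the wrapper, the strategy table `sigma` (mapping observation histories to actions), and the observation alphabet `OBS` are illustrative assumptions, not the paper's implementation.

```python
# Sketch: learn a Mealy-machine representation of a tabular POMDP strategy
# via L*, using AALpy. `sigma` and `OBS` are assumed to be given.
from aalpy.base import SUL
from aalpy.learning_algs import run_Lstar
from aalpy.oracles import RandomWalkEqOracle

class StrategySUL(SUL):
    """Answers membership queries by replaying the tabular strategy."""
    def __init__(self, sigma):
        super().__init__()
        self.sigma = sigma
        self.history = ()

    def pre(self):            # reset before each query
        self.history = ()

    def post(self):
        pass

    def step(self, obs):      # feed one observation, output the chosen action
        self.history += (obs,)
        return self.sigma.get(self.history, 'default_action')

sul = StrategySUL(sigma)
eq_oracle = RandomWalkEqOracle(OBS, sul, num_steps=5000, reset_prob=0.09)
mealy = run_Lstar(OBS, sul, eq_oracle, automaton_type='mealy')
# The states of `mealy` make the memory the strategy actually uses explicit.
```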
Abstract: We study how to efficiently combine formal methods, Monte Carlo Tree Search (MCTS), and deep learning in order to produce high-quality receding-horizon policies in large Markov decision processes (MDPs). In particular, we use model-checking techniques to guide the MCTS algorithm so that it generates offline samples of high-quality decisions on a representative set of states of the MDP. Those samples can then be used to train a neural network that imitates the policy used to generate them. This neural network can either guide a lower-latency online MCTS search or serve as a full-fledged policy when minimal latency is required. We use statistical model checking to detect when additional samples are needed and to focus those additional samples on configurations where the learnt neural-network policy differs from the (computationally expensive) offline policy. We illustrate the use of our method on MDPs that model the Frozen Lake and Pac-Man environments, two popular benchmarks for evaluating reinforcement-learning algorithms.
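The imitation loop can be sketched as follows, assuming two hypothetical helpers that stand in for the paper's machinery: `mcts_policy(state)`, the expensive offline MCTS-plus-model-checking oracle returning an action index, and `sample_states(k)`, which draws representative states as feature vectors. The network architecture and the disagreement-driven resampling are placeholders for the role statistical model checking plays in the paper.

```python
# Sketch: train a network to imitate an expensive offline MCTS policy,
# resampling where the two disagree. Helper names are illustrative.
import torch
import torch.nn as nn

def train_imitation_net(n_features, n_actions, rounds=5, batch=512):
    net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                        nn.Linear(64, n_actions))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    dataset = []
    for _ in range(rounds):
        # 1. Label fresh states with the expensive offline policy.
        dataset += [(s, mcts_policy(s)) for s in sample_states(batch)]
        X = torch.tensor([s for s, _ in dataset], dtype=torch.float32)
        y = torch.tensor([a for _, a in dataset])
        # 2. Fit the network to imitate those decisions.
        for _ in range(200):
            opt.zero_grad()
            loss = loss_fn(net(X), y)
            loss.backward()
            opt.step()
        # 3. Add samples where the network disagrees with the oracle.
        dataset += [(s, mcts_policy(s)) for s in sample_states(batch)
                    if net(torch.tensor(s).float()).argmax().item() != mcts_policy(s)]
    return net
```

Once trained, the network can answer in microseconds where the offline oracle takes seconds, which is what makes the low-latency deployment modes possible.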
Abstract: The problems of selecting partial correlation and causality graphs for count data are considered. A parameter-driven generalized linear model is used to describe the observed multivariate time series of counts. The partial correlation and causality graphs corresponding to this model explain the dependencies between the individual time series of the multivariate count data. In order to estimate these graphs with tunable sparsity, an appropriate likelihood function maximization is regularized with an l1-type constraint. A novel Monte Carlo expectation-maximization (MCEM) algorithm is proposed to iteratively solve this regularized MLE. Asymptotic convergence results are proved for the sequence generated by the proposed MCEM algorithm with l1-type regularization. The algorithm is first successfully tested on simulated data. Thereafter, it is applied to observed weekly dengue disease counts from each ward of Greater Mumbai city. The interdependence of the various wards in the proliferation of the disease is characterized by the edges of the inferred partial correlation graph. On the other hand, the relative roles of the various wards as sources and sinks of dengue spread are quantified by the number and weights of the directed edges originating from and incident upon each ward. From these estimated graphs, it is observed that some wards act as epicentres of dengue spread even though their own disease counts are relatively low.
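To make the estimation problem concrete, a plausible form of the regularized objective is sketched below; the paper's exact parameterization may differ. Here $\theta$ collects the GLM coefficients and latent-process parameters whose sparsity pattern induces the graphs, $Y$ is the observed multivariate count series, $X$ the latent process, and $\lambda > 0$ tunes the sparsity:

$$
\hat{\theta} \in \arg\max_{\theta}\; \log p_{\theta}(Y) - \lambda \lVert \theta \rVert_1,
\qquad
\log p_{\theta}(Y) = \log \int p_{\theta}(Y, X)\, dX .
$$

Since the integral over the latent process is intractable, each MCEM iteration replaces the E-step expectation with a Monte Carlo average over posterior draws of the latent process:

$$
\theta^{(k+1)} \in \arg\max_{\theta}\;
\frac{1}{M} \sum_{m=1}^{M} \log p_{\theta}\bigl(Y, X^{(m)}\bigr) - \lambda \lVert \theta \rVert_1,
\qquad
X^{(m)} \sim p_{\theta^{(k)}}\bigl(X \mid Y\bigr).
$$

The l1 penalty drives many entries of $\theta$ to zero, and the surviving nonzero entries define the edges of the inferred partial correlation and causality graphs.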