Abstract:Deep reinforcement learning (RL) agents are prone to shortcut learning, which prevents them from generalizing to even slightly different environments. To address this problem, symbolic methods that use object-centric states have been developed. However, comparing these methods to deep agents is unfair, as the latter operate on raw pixel-based states. In this work, we instantiate the symbolic SCoBots framework. SCoBots decompose RL tasks into intermediate, interpretable representations, culminating in action decisions based on a comprehensible set of object-centric relational concepts. This architecture helps demystify agent decisions. By combining explicitly learned extraction of object-centric representations from raw states, object-centric RL, and policy distillation via rule extraction, this work places itself within the neurosymbolic AI paradigm, blending the strengths of neural networks with those of symbolic AI. We present the first implementation of an end-to-end trained SCoBot and separately evaluate each of its components on different Atari games. The results demonstrate the framework's potential to create interpretable and well-performing RL systems, and pave the way for future research directions in obtaining end-to-end interpretable RL agents.
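To make the three-stage decomposition concrete, the following is a minimal sketch of a SCoBot-style pipeline. It is not the official implementation; all names (extract_objects, relational_concepts, rule_policy) are hypothetical placeholders for illustration.

```python
# Minimal sketch (not the official SCoBots code): a SCoBot-style pipeline with three
# interpretable stages from raw pixels to an action decided by an extracted rule set.
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    x: float
    y: float

def extract_objects(frame):
    """Stage 1 (object extractor): map raw pixels to object-centric symbols.
    Stubbed with fixed detections here; in practice a learned detector is used."""
    return [Obj("player", 140.0, 60.0), Obj("ball", 100.0, 45.0)]

def relational_concepts(objs):
    """Stage 2 (relation extractor): derive interpretable relational concepts."""
    player = next(o for o in objs if o.name == "player")
    ball = next(o for o in objs if o.name == "ball")
    return {"dy(player, ball)": player.y - ball.y,
            "dx(player, ball)": player.x - ball.x}

def rule_policy(concepts):
    """Stage 3 (action selector): a distilled, human-readable rule set."""
    if concepts["dy(player, ball)"] > 2:
        return "UP"
    if concepts["dy(player, ball)"] < -2:
        return "DOWN"
    return "NOOP"

frame = None  # placeholder for a raw Atari frame
print(rule_policy(relational_concepts(extract_objects(frame))))
```

Because every stage emits a human-readable intermediate result, the final rule set can be audited end to end.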
Abstract:Humans can leverage both symbolic reasoning and intuitive reactions. In contrast, reinforcement learning policies are typically encoded in either opaque systems like neural networks or symbolic systems that rely on predefined symbols and rules. This disjointed approach severely limits the agents' capabilities, as they often lack either the flexible low-level reaction characteristic of neural agents or the interpretable reasoning of symbolic agents. To overcome this challenge, we introduce BlendRL, a neuro-symbolic RL framework that harmoniously integrates both paradigms within RL agents that use mixtures of both logic and neural policies. We empirically demonstrate that BlendRL agents outperform both neural and symbolic baselines in standard Atari environments, and showcase their robustness to environmental changes. Additionally, we analyze the interaction between neural and symbolic policies, illustrating how their hybrid use helps agents overcome each other's limitations.
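A minimal sketch of the core blending idea, under the assumption that both policies emit action probabilities and a state-dependent weight decides how much to trust each; the names (neural_policy, logic_policy, blend_weight) are illustrative and not BlendRL's API.

```python
# Minimal sketch (assumptions, not the BlendRL implementation): blend the action
# distributions of a neural policy and a logic policy with a state-dependent weight.
import numpy as np

def neural_policy(state):
    """Hypothetical neural policy returning action probabilities."""
    logits = np.array([0.2, 1.5, 0.1])
    return np.exp(logits) / np.exp(logits).sum()

def logic_policy(state):
    """Hypothetical logic policy: probabilities obtained from weighted rules."""
    return np.array([0.7, 0.2, 0.1])

def blend_weight(state):
    """Hypothetical gate deciding how much to rely on the logic policy."""
    return 0.6

def blended_policy(state):
    w = blend_weight(state)
    return w * logic_policy(state) + (1.0 - w) * neural_policy(state)

probs = blended_policy(state=None)
action = int(np.argmax(probs))  # or sample: np.random.choice(len(probs), p=probs)
```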
Abstract:Properly defining a reward signal to efficiently train a reinforcement learning (RL) agent is a challenging task. Designing balanced objective functions from which a desired behavior can emerge requires expert knowledge, especially for complex environments. Learning rewards from human feedback or using large language models (LLMs) to directly provide rewards are promising alternatives, allowing non-experts to specify goals for the agent. However, black-box reward models make it difficult to debug the reward. In this work, we propose Object-Centric Assessment with Language Models (OCALM) to derive inherently interpretable reward functions for RL agents from natural language task descriptions. OCALM uses the extensive world-knowledge of LLMs while leveraging the object-centric nature common to many environments to derive reward functions focused on relational concepts, providing RL agents with the ability to derive policies from task descriptions.
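As an illustration of what an interpretable, object-centric reward can look like, here is a minimal hand-written sketch for a task description such as "keep the paddle close to the ball"; it is a hypothetical example, not OCALM's actual output format.

```python
# Minimal sketch (hypothetical): an interpretable reward defined over relational
# concepts between objects, rather than over raw pixels or a black-box model.
def reward(objects):
    """objects: dict mapping object names to (x, y) positions."""
    px, py = objects["player"]
    bx, by = objects["ball"]
    vertical_gap = abs(py - by)
    # Bonus when the paddle is vertically aligned with the ball, small penalty otherwise.
    return 1.0 if vertical_gap < 10 else -0.01 * vertical_gap

print(reward({"player": (140, 60), "ball": (100, 45)}))
```

Because the reward is a short, named expression over objects, a non-expert can read and debug it directly.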
Abstract:Reinforcement learning (RL) has proven to be a powerful tool for training agents that excel in various games. However, the black-box nature of neural network models often hinders our ability to understand the reasoning behind the agent's actions. Recent research has attempted to address this issue by using the guidance of pretrained neural agents to encode logic-based policies, allowing for interpretable decisions. A drawback of such approaches is the requirement of large amounts of predefined background knowledge in the form of predicates, limiting their applicability and scalability. In this work, we propose a novel approach, Explanatory Predicate Invention for Learning in Games (EXPIL), that identifies and extracts predicates from a pretrained neural agent, which are later used in the logic-based agents, reducing the dependency on predefined background knowledge. Our experimental evaluation on various games demonstrates the effectiveness of EXPIL in achieving explainable behavior in logic agents while requiring less background knowledge.
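A minimal sketch of what an invented predicate amounts to, assuming object states are simple coordinate dictionaries; the predicate closeby and the rule below are hypothetical illustrations, not EXPIL's extraction procedure.

```python
# Minimal sketch (hypothetical, not EXPIL's algorithm): an invented predicate is a
# named, reusable test over object states; logic rules over such predicates replace
# hand-written background knowledge.
def closeby(a, b, threshold=12.0):
    """Predicate extracted from agent behavior instead of being predefined."""
    return abs(a["x"] - b["x"]) + abs(a["y"] - b["y"]) < threshold

def rule_jump(player, enemy):
    """Readable rule using the invented predicate: jump(X) :- closeby(player, enemy)."""
    return closeby(player, enemy)

print(rule_jump({"x": 30, "y": 10}, {"x": 38, "y": 12}))
```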
Abstract:Artificial agents' adaptability to novelty and alignment with intended behavior is crucial for their effective deployment. Reinforcement learning (RL) leverages novelty as a means of exploration, yet agents often struggle to handle novel situations, hindering generalization. To address these issues, we propose HackAtari, a framework introducing controlled novelty to the most common RL benchmark, the Atari Learning Environment. HackAtari allows us to create novel game scenarios (including simplification for curriculum learning), to swap the game elements' colors, as well as to introduce different reward signals for the agent. We demonstrate that current agents trained on the original environments exhibit robustness failures, and evaluate HackAtari's efficacy in enhancing RL agents' robustness and aligning behavior through experiments using C51 and PPO. Overall, HackAtari can be used to improve the robustness of current and future RL algorithms, enabling Neuro-Symbolic RL, curriculum RL, causal RL, as well as LLM-driven RL. Our work underscores the significance of developing interpretable RL agents.
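The novelty-injection idea can be illustrated with a plain Gymnasium wrapper that recolors observations and rescales rewards; this is a generic sketch under assumed parameters (color_shift, reward_scale), not HackAtari's actual interface.

```python
# Minimal sketch (not HackAtari's API): inject controlled novelty through an
# environment wrapper that recolors observations and reshapes the reward signal.
import gymnasium as gym
import numpy as np

class NoveltyWrapper(gym.Wrapper):
    def __init__(self, env, color_shift=40, reward_scale=0.1):
        super().__init__(env)
        self.color_shift = color_shift      # hypothetical: shift all pixel colors
        self.reward_scale = reward_scale    # hypothetical: alternative reward signal

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        obs = (obs.astype(np.int16) + self.color_shift) % 256  # swap game colors
        return obs.astype(np.uint8), self.reward_scale * reward, terminated, truncated, info

# Example usage (requires ale-py and the Atari ROMs):
# env = NoveltyWrapper(gym.make("ALE/Pong-v5"))
```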
Abstract:Deep reinforcement learning agents are prone to goal misalignments. The black-box nature of their policies hinders the detection and correction of such misalignments, and the trust necessary for real-world deployment. So far, solutions learning interpretable policies are inefficient or require many human priors. We propose INTERPRETER, a fast distillation method producing INTerpretable Editable tRee Programs for ReinforcEmenT lEaRning. We empirically demonstrate that INTERPRETER's compact tree programs match oracles across a diverse set of sequential decision tasks, and we evaluate the impact of our design choices on interpretability and performance. We show that our policies can be interpreted and edited to correct misalignments on Atari games and to explain real farming strategies.
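A minimal sketch of the distillation step, assuming access to an oracle policy and using scikit-learn decision trees as a stand-in for tree programs; the oracle here is a toy rule, not a trained deep RL agent.

```python
# Minimal sketch (assumptions, not the INTERPRETER pipeline): distill an oracle policy
# into a small, readable decision tree via behavioral cloning on oracle-labeled states.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def oracle(state):
    """Hypothetical stand-in for a pretrained deep RL oracle."""
    return int(state[0] > state[1])

states = np.random.rand(5000, 4)                 # states collected from oracle rollouts
actions = np.array([oracle(s) for s in states])  # oracle action labels

tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
print(export_text(tree, feature_names=["f0", "f1", "f2", "f3"]))  # editable tree program
```

The printed tree can be read, edited, and redeployed, which is the property the abstract highlights for correcting misalignments.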
Abstract:Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions: what distinguishes explainability from interpretability? Should explainable and interpretable agents be developed outside of domains where transparency is imperative? What advantages do interpretable policies offer over neural networks? How can we rigorously define and measure interpretability in policies, without user studies? Which reinforcement learning paradigms are the most suited to developing interpretable agents? Can Markov Decision Processes integrate interpretable state representations? In addition to motivating an Interpretable RL community centered around the aforementioned questions, we propose the first venue dedicated to Interpretable RL: the InterpPol Workshop.
Abstract:The challenge in learning abstract concepts from images in an unsupervised fashion lies in the required integration of visual perception and generalizable relational reasoning. Moreover, the unsupervised nature of this task makes it necessary for human users to be able to understand a model's learnt concepts and potentially revise false behaviours. To tackle both the generalizability and interpretability constraints of visual concept learning, we propose Pix2Code, a framework that extends program synthesis to visual relational reasoning by utilizing the abilities of both explicit, compositional symbolic and implicit neural representations. This is achieved by retrieving object representations from images and synthesizing relational concepts as lambda-calculus programs. We evaluate the diverse properties of Pix2Code on the challenging reasoning domains, Kandinsky Patterns and CURI, thereby testing its ability to identify compositional visual concepts that generalize to novel data and concept configurations. Particularly, in stark contrast to neural approaches, we show that Pix2Code's representations remain human interpretable and can be easily revised for improved performance.
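A minimal sketch of a relational concept expressed as a small functional (lambda-style) program over extracted object representations; the Obj fields and the same_color concept are hypothetical illustrations, not Pix2Code's program representation.

```python
# Minimal sketch (hypothetical): a visual relational concept as a compositional,
# human-readable program over object representations retrieved from an image.
from dataclasses import dataclass

@dataclass
class Obj:
    shape: str
    color: str

# "All objects share the same color", written as a lambda-style program over a scene.
same_color = lambda objs: all(o.color == objs[0].color for o in objs)

scene = [Obj("circle", "red"), Obj("square", "red"), Obj("triangle", "red")]
print(same_color(scene))  # True: the concept holds for this Kandinsky-style scene
```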
Abstract:Goal misalignment, reward sparsity and difficult credit assignment are only a few of the many issues that make it difficult for deep reinforcement learning (RL) agents to learn optimal policies. Unfortunately, the black-box nature of deep neural networks impedes the inclusion of domain experts for inspecting the model and revising suboptimal policies. To this end, we introduce *Successive Concept Bottleneck Agents* (SCoBots), that integrate consecutive concept bottleneck (CB) layers. In contrast to current CB models, SCoBots do not just represent concepts as properties of individual objects, but also as relations between objects which is crucial for many RL tasks. Our experimental results provide evidence of SCoBots' competitive performances, but also of their potential for domain experts to understand and regularize their behavior. Among other things, SCoBots enabled us to identify a previously unknown misalignment problem in the iconic video game, Pong, and resolve it. Overall, SCoBots thus result in more human-aligned RL agents. Our code is available at https://github.com/k4ntz/SCoBots.
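To illustrate how a concept bottleneck supports expert revision, the sketch below assumes hypothetical relational concepts for Pong and shows an expert masking one deemed spurious; the names and the masking mechanism are illustrative, not the SCoBots code.

```python
# Minimal sketch (hypothetical names): concepts in the bottleneck are inspectable,
# so a domain expert can prune a misleading one and force the policy to rely on
# the remaining, intended concepts.
def relational_concepts(player, ball, enemy):
    return {
        "dy(player, ball)": player[1] - ball[1],
        "dy(player, enemy)": player[1] - enemy[1],  # a shortcut the expert wants removed
    }

EXPERT_MASK = {"dy(player, enemy)"}                 # concepts pruned by the domain expert

def bottleneck(player, ball, enemy):
    concepts = relational_concepts(player, ball, enemy)
    return {k: v for k, v in concepts.items() if k not in EXPERT_MASK}

print(bottleneck((140, 60), (100, 45), (20, 58)))
```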
Abstract:Cognitive science and psychology suggest that object-centric representations of complex scenes are a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep reinforcement learning approaches rely only on pixel-based representations that do not capture the compositional properties of natural scenes. For this, we need environments and datasets that allow us to work with and evaluate object-centric approaches. We present OCAtari, a set of environments that provide object-centric state representations of Atari games, the most-used evaluation framework for deep RL approaches. OCAtari also allows for RAM state manipulations of the games to change and create specific or even novel situations. The code base for this work is available at github.com/k4ntz/OC_Atari.
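A usage sketch assuming the OCAtari package is installed and the Pong ROM is available; the constructor arguments and object attributes (mode, objects, category, xy) follow the project's documentation but may differ between versions.

```python
# Usage sketch (attribute names are assumptions and may differ across OCAtari versions):
# step the environment and read the object-centric state instead of raw pixels.
from ocatari.core import OCAtari

env = OCAtari("Pong", mode="ram", render_mode="rgb_array")
obs, info = env.reset()
for _ in range(10):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    for obj in env.objects:             # object-centric state, e.g. Ball, Player
        print(obj.category, obj.xy)     # object type and (x, y) position
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```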