Laurenz Wiskott

Putting the Iterative Training of Decision Trees to the Test on a Real-World Robotic Task

Dec 06, 2024

Raphael C. Engelhardt, Marcel J. Meinen, Moritz Lange, Laurenz Wiskott, Wolfgang Konen

Abstract: In previous research, we developed methods to train decision trees (DTs) as agents for reinforcement learning tasks, based on deep reinforcement learning (DRL) networks. The samples from which the DTs are built use the environment's state as features and the corresponding action as label. To solve the nontrivial task of selecting samples that, on the one hand, reflect the DRL agent's capability to choose the right action and, on the other hand, cover enough of the state space to generalize well, we developed an algorithm to iteratively train DTs. In this short paper, we apply this algorithm to a real-world implementation of a robotic task for the first time. Real-world tasks pose additional challenges compared to simulations, such as noise and delays. The task consists of a physical pendulum attached to a cart, which moves on a linear track. By movements to the left and to the right, the pendulum is to be swung into the upright position and balanced in the unstable equilibrium. Our results demonstrate the applicability of the algorithm to real-world tasks by generating a DT whose performance matches the performance of the DRL agent, while consisting of fewer parameters. This research could be a starting point for distilling DTs from DRL agents to obtain transparent, lightweight models for real-world reinforcement learning tasks.

* 5 pages, 4 figures
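
The idea of iteratively collecting (state, action) samples and refitting a tree can be sketched with a toy example. This is a generic DAgger-style loop, not the authors' algorithm, whose details are not given in the abstract. Everything here is a hypothetical stand-in: `oracle_action` plays the role of the trained DRL agent, `step` is an invented 1-D environment, and a depth-1 decision stump replaces a full decision tree.

```python
import random


def oracle_action(x):
    """Hypothetical stand-in for the DRL agent: push the state toward 0."""
    return 0 if x > 0.0 else 1  # 0 = move left, 1 = move right


def step(x, action, rng):
    """Invented toy dynamics: the action shifts the state, plus small noise."""
    delta = 0.2 if action == 1 else -0.2
    x_new = x + delta + rng.uniform(-0.05, 0.05)
    return max(-1.0, min(1.0, x_new))  # keep the state inside [-1, 1]


def rollout(policy, start, rng, horizon=8):
    """Run `policy` from `start` and return the states it visits."""
    states, x = [], start
    for _ in range(horizon):
        states.append(x)
        x = step(x, policy(x), rng)
    return states


def fit_stump(samples):
    """Fit a depth-1 tree (threshold + two leaf labels) with minimal error."""
    xs = sorted(x for x, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in candidates:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != a for x, a in samples)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def iterative_distill(n_iterations=4, seed=0):
    """Alternate between rolling out the current policy and refitting the stump."""
    rng = random.Random(seed)
    starts = [-0.9, -0.3, 0.3, 0.9]
    dataset = []
    policy = oracle_action  # the first iteration explores with the oracle
    for _ in range(n_iterations):
        for start in starts:
            for x in rollout(policy, start, rng):
                # States come from the current policy, labels from the oracle.
                dataset.append((x, oracle_action(x)))
        policy = fit_stump(dataset)
    return policy, dataset


tree, data = iterative_distill()
```

In this sketch the states visited by the current student are always labeled by the oracle, so later iterations add samples from the regions the student actually reaches. That is one common way to balance fidelity to the teacher against state-space coverage.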

What is the relationship between Slow Feature Analysis and the Successor Representation?

Sep 25, 2024

Eddie Seabrook, Laurenz Wiskott

Abstract: (This is a work in progress. Feedback is welcome.) An analytical comparison is made between slow feature analysis (SFA) and the successor representation (SR). While SFA and the SR stem from distinct areas of machine learning, they share important properties, both in terms of their mathematics and the types of information they are sensitive to. This work studies their connection along these two axes. In particular, multiple variants of the SFA algorithm are explored analytically and then applied to the setting of an MDP, leading to a family of eigenvalue problems involving the SR and other related quantities. These resulting eigenvalue problems are then illustrated in the toy setting of a gridworld, where it is demonstrated that the place- and grid-like fields often associated with the SR can equally be generated using SFA.

* 52 pages, 5 figures
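
For orientation, the standard definition of the SR under a fixed policy can be written as below. The symbols are notation assumed here rather than taken from the abstract: $P$ is the state-transition matrix, $\gamma$ the discount factor, and $M$ the SR matrix. The paper's own conventions may differ.

```latex
\begin{align}
M &= \sum_{t=0}^{\infty} \gamma^{t} P^{t} = (I - \gamma P)^{-1},
  \qquad 0 \le \gamma < 1, \\
P v &= \lambda v \;\Longrightarrow\; M v = \frac{1}{1 - \gamma \lambda}\, v .
\end{align}
```

The second line shows that $M$ shares its eigenvectors with $P$, which is one way to see why eigenvalue problems appear when the SR is related to a spectral method such as SFA.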

Exploring the limits of Hierarchical World Models in Reinforcement Learning

Jun 01, 2024

Robin Schiewer, Anand Subramoney, Laurenz Wiskott

Abstract: Hierarchical model-based reinforcement learning (HMBRL) aims to combine the better sample efficiency of model-based reinforcement learning (MBRL) with the abstraction capability of hierarchical reinforcement learning (HRL) to solve complex tasks efficiently. While HMBRL has great potential, it still lacks wide adoption. In this work we describe a novel HMBRL framework and evaluate it thoroughly. To complement the multi-layered decision-making idiom characteristic of HRL, we construct hierarchical world models that simulate environment dynamics at various levels of temporal abstraction. These models are used to train a stack of agents that communicate in a top-down manner by proposing goals to their subordinate agents. A significant focus of this study is the exploration of a static and environment-agnostic temporal abstraction, which allows concurrent training of models and agents throughout the hierarchy. Unlike most goal-conditioned H(MB)RL approaches, it also leads to comparatively low-dimensional abstract actions. Although our HMBRL approach did not outperform traditional methods in terms of final episode returns, it successfully facilitated decision making across two levels of abstraction using compact, low-dimensional abstract actions. A central challenge in enhancing our method's performance, as uncovered through comprehensive experimentation, is model exploitation on the abstract level of our world model stack. We provide an in-depth examination of this issue, discussing its implications for the field and suggesting directions for future research to overcome this challenge. By sharing these findings, we aim to contribute to the broader discourse on refining HMBRL methodologies and to assist in the development of more effective autonomous learning systems for complex decision-making environments.

* 26 pages, 14 figures

ProtoP-OD: Explainable Object Detection with Prototypical Parts

Feb 29, 2024

Pavlos Rath-Manakidis, Frederik Strothmann, Tobias Glasmachers, Laurenz Wiskott

Abstract: Interpretation and visualization of the behavior of detection transformers tend to highlight the locations in the image that the model attends to, but they provide limited insight into the semantics that the model is focusing on. This paper introduces an extension to detection transformers that constructs prototypical local features and uses them in object detection. These custom features, which we call prototypical parts, are designed to be mutually exclusive and align with the classifications of the model. The proposed extension consists of a bottleneck module, the prototype neck, that computes a discretized representation of prototype activations and a new loss term that matches prototypes to object classes. This setup leads to interpretable representations in the prototype neck, allowing visual inspection of the image content perceived by the model and a better understanding of the model's reliability. We show experimentally that our method incurs only a limited performance penalty, and we provide examples that demonstrate the quality of the explanations provided by our method, which we argue outweighs the performance penalty.

* 9 pages, 11 figures

Interpretable Brain-Inspired Representations Improve RL Performance on Visual Navigation Tasks

Feb 19, 2024

Moritz Lange, Raphael C. Engelhardt, Wolfgang Konen, Laurenz Wiskott

Abstract: Visual navigation requires a whole range of capabilities. A crucial one of these is the ability of an agent to determine its own location and heading in an environment. Prior works commonly assume this information as given, or use methods which lack a suitable inductive bias and accumulate error over time. In this work, we show how the method of slow feature analysis (SFA), inspired by neuroscience research, overcomes both limitations by generating interpretable representations of visual data that encode location and heading of an agent. We employ SFA in a modern reinforcement learning context, analyse and compare representations, and illustrate where hierarchical SFA can outperform other feature extractors on navigation tasks.

* Accepted at XAI4DRL workshop at AAAI 2024

Classification and Reconstruction Processes in Deep Predictive Coding Networks: Antagonists or Allies?

Jan 17, 2024

Jan Rathjens, Laurenz Wiskott

Abstract: Predictive coding-inspired deep networks for visual computing integrate classification and reconstruction processes in shared intermediate layers. Although synergy between these processes is commonly assumed, it has yet to be convincingly demonstrated. In this study, we take a critical look at how classifying and reconstructing interact in deep learning architectures. Our approach utilizes a purposefully designed family of model architectures reminiscent of autoencoders, each equipped with an encoder, a decoder, and a classification head featuring varying modules and complexities. We meticulously analyze the extent to which classification- and reconstruction-driven information can seamlessly coexist within the shared latent layer of the model architectures. Our findings underscore a significant challenge: Classification-driven information diminishes reconstruction-driven information in intermediate layers' shared representations and vice versa. While expanding the shared representation's dimensions or increasing the network's complexity can alleviate this trade-off effect, our results challenge prevailing assumptions in predictive coding and offer guidance for future iterations of predictive coding concepts in deep networks.

Improving Reinforcement Learning Efficiency with Auxiliary Tasks in Non-Visual Environments: A Comparison

Oct 09, 2023

Moritz Lange, Noah Krystiniak, Raphael C. Engelhardt, Wolfgang Konen, Laurenz Wiskott

Abstract: Real-world reinforcement learning (RL) environments, whether in robotics or industrial settings, often involve non-visual observations and require not only efficient but also reliable and thus interpretable and flexible RL approaches. To improve efficiency, agents that perform state representation learning with auxiliary tasks have been widely studied in visual observation contexts. However, for real-world problems, dedicated representation learning modules that are decoupled from RL agents are more suited to meet requirements. This study compares common auxiliary tasks based on, to the best of our knowledge, the only decoupled representation learning method for low-dimensional non-visual observations. We evaluate potential improvements in sample efficiency and returns for environments ranging from a simple pendulum to a complex simulated robotics task. Our findings show that representation learning with auxiliary tasks only provides performance gains in sufficiently complex environments and that learning environment dynamics is preferable to predicting rewards. These insights can inform future development of interpretable representation learning approaches for non-visual observations and advance the use of RL solutions in real-world scenarios.

* Accepted at LOD 2023

A Tutorial on the Spectral Theory of Markov Chains

Jul 05, 2022

Eddie Seabrook, Laurenz Wiskott

Abstract: Markov chains are a class of probabilistic models that have achieved widespread application in the quantitative sciences. This is in part due to their versatility, but is compounded by the ease with which they can be probed analytically. This tutorial provides an in-depth introduction to Markov chains, and explores their connection to graphs and random walks. We utilize tools from linear algebra and graph theory to describe the transition matrices of different types of Markov chains, with a particular focus on exploring properties of the eigenvalues and eigenvectors corresponding to these matrices. The results presented are relevant to a number of methods in machine learning and data mining, which we describe at various stages. Rather than being a novel academic study in its own right, this text presents a collection of known results, together with some new concepts. Moreover, the tutorial focuses on offering intuition to readers rather than formal understanding, and only assumes basic exposure to concepts from linear algebra and probability theory. It is therefore accessible to students and researchers from a wide variety of disciplines.
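
As a small illustration of the eigenvector view mentioned in the abstract, the sketch below finds the stationary distribution of a two-state chain by power iteration. The stationary distribution is the left eigenvector of the transition matrix with eigenvalue 1. The matrix `P` and the function names are illustrative assumptions, not taken from the tutorial.

```python
def step_distribution(pi, P):
    """One step of the chain: returns the row vector pi multiplied by P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_distribution(P, iterations=200):
    """Power iteration starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = step_distribution(pi, P)
    return pi


# Row-stochastic transition matrix of a two-state chain (illustrative values).
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]

pi = stationary_distribution(P)
```

For this `P`, the balance condition 0.1 * pi[0] = 0.5 * pi[1] gives the stationary distribution (5/6, 1/6). The second eigenvalue of `P` is 0.4, so the error of the power iteration shrinks by roughly that factor per step.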

A model of semantic completion in generative episodic memory

Nov 26, 2021

Zahra Fayyaz, Aya Altamimi, Sen Cheng, Laurenz Wiskott

Abstract: Many different studies have suggested that episodic memory is a generative process, but most computational models adopt a storage view. In this work, we propose a computational model for generative episodic memory. It is based on the central hypothesis that the hippocampus stores and retrieves selected aspects of an episode as a memory trace, which is necessarily incomplete. At recall, the neocortex reasonably fills in the missing information based on general semantic information in a process we call semantic completion. As episodes we use images of digits (MNIST) augmented by different backgrounds representing context. Our model is based on a VQ-VAE which generates a compressed latent representation in the form of an index matrix, which still has some spatial resolution. We assume that attention selects some parts of the index matrix while others are discarded; the selected parts then represent the gist of the episode and are stored as a memory trace. At recall, the missing parts are filled in by a PixelCNN, modeling semantic completion, and the completed index matrix is then decoded into a full image by the VQ-VAE. The model is able to complete missing parts of a memory trace in a semantically plausible way up to the point where it can generate plausible images from scratch. Due to the combinatorics in the index matrix, the model generalizes well to images not trained on. Compression as well as semantic completion contribute to a strong reduction in memory requirements and robustness to noise. Finally, we also model an episodic memory experiment and can reproduce that semantically congruent contexts are always recalled better than incongruent ones, high attention levels improve memory accuracy in both cases, and contexts that are not remembered correctly are more often remembered semantically congruently than completely wrong.

* 15 pages, 9 figures, 58 references

Modular Networks Prevent Catastrophic Interference in Model-Based Multi-Task Reinforcement Learning

Nov 15, 2021

Robin Schiewer, Laurenz Wiskott

Abstract: In a multi-task reinforcement learning setting, the learner commonly benefits from training on multiple related tasks by exploiting similarities among them. At the same time, the trained agent is able to solve a wider range of different problems. While this effect is well documented for model-free multi-task methods, we demonstrate a detrimental effect when using a single learned dynamics model for multiple tasks. Thus, we address the fundamental question of whether model-based multi-task reinforcement learning benefits from shared dynamics models in a similar way as model-free methods do from shared policy networks. Using a single dynamics model, we see clear evidence of task confusion and reduced performance. As a remedy, enforcing an internal structure for the learned dynamics model by training isolated sub-networks for each task notably improves performance while using the same number of parameters. We illustrate our findings by comparing both methods on a simple gridworld and a more complex ViZDoom multi-task experiment.

* 15 pages, preprint of a paper presented at LOD 2021
