Erik Schaffernicht

Beyond Predefined Actions: Integrating Behavior Trees and Dynamic Movement Primitives for Robot Learning from Demonstration

May 13, 2025

David Cáceres Domínguez, Erik Schaffernicht, Todor Stoyanov

Abstract: Interpretable policy representations like Behavior Trees (BTs) and Dynamic Movement Primitives (DMPs) enable robot skill transfer from human demonstrations, but each faces limitations: BTs require expert-crafted low-level actions, while DMPs lack high-level task logic. We address these limitations by integrating DMP controllers into a BT framework, jointly learning the BT structure and DMP actions from single demonstrations, thereby removing the need for predefined actions. Additionally, by combining BT decision logic with DMP motion generation, our method enhances policy interpretability, modularity, and adaptability for autonomous systems. Our approach readily affords both learning to replicate low-level motions and combining partial demonstrations into a coherent and easy-to-modify overall policy.

* 14 pages, 6 figures, accepted (not yet published) at IAS19 2025 conference
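
For readers unfamiliar with DMPs, the following is a minimal sketch of a one-dimensional discrete DMP rollout in the commonly used spring-damper formulation with a phase-gated forcing term. It is not taken from the paper: the function name `rollout_dmp`, the gains, and the basis-width heuristic are all assumptions chosen for illustration.

```python
import math

def rollout_dmp(y0, goal, weights, tau=1.0, duration=1.0, dt=0.001,
                alpha=25.0, beta=6.25, alpha_x=3.0):
    """Integrate a 1-D discrete DMP with explicit Euler steps (illustrative only).

    Transformation system: tau * dz = alpha * (beta * (goal - y) - z) + f
                           tau * dy = z
    Canonical system:      tau * dx = -alpha_x * x
    """
    n = len(weights)
    # Basis centers placed along the decaying phase x (assumed layout).
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    # Width heuristic; any positive widths would do for this sketch.
    widths = [n ** 1.5 / c / alpha_x for c in centers]

    y, z, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(round(duration / dt)):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        total = sum(psi)
        # Forcing term: normalized weighted basis functions, gated by the phase x.
        f = 0.0
        if total > 0.0:
            f = sum(p * w for p, w in zip(psi, weights)) / total * x * (goal - y0)
        dz = (alpha * (beta * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory

# With all weights at zero the forcing term vanishes and the system
# behaves like a critically damped spring pulling y towards the goal.
trajectory = rollout_dmp(y0=0.0, goal=1.0, weights=[0.0] * 5)
```

In this formulation, learning from a demonstration would amount to fitting `weights` so that the forcing term reproduces the demonstrated motion; that fitting step is omitted here.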

On the Fly Adaptation of Behavior Tree-Based Policies through Reinforcement Learning

Mar 08, 2025

Marco Iannotta, Johannes A. Stork, Erik Schaffernicht, Todor Stoyanov

Abstract: With the rising demand for flexible manufacturing, robots are increasingly expected to operate in dynamic environments where local task variations, such as slight offsets or size differences in workpieces, are common. We propose to address the problem of adapting robot behaviors to these task variations with a sample-efficient hierarchical reinforcement learning approach that adapts Behavior Tree (BT)-based policies. We maintain the core BT properties as an interpretable, modular framework for structuring reactive behaviors, but extend their use beyond static tasks by inherently accommodating local task variations. To show the efficiency and effectiveness of our approach, we conduct experiments both in simulation and on a Franka Emika Panda 7-DoF manipulator, which adapts to different obstacle avoidance and pivoting tasks.

On the Effects of Irrelevant Variables in Treatment Effect Estimation with Deep Disentanglement

Jul 29, 2024

Ahmad Saeed Khan, Erik Schaffernicht, Johannes Andreas Stork

Abstract: Estimating treatment effects from observational data is paramount in healthcare, education, and economics, but current deep disentanglement-based methods for addressing selection bias handle irrelevant variables insufficiently. We demonstrate in experiments that this leads to prediction errors. We disentangle pre-treatment variables with a deep embedding method and explicitly identify and represent irrelevant variables, in addition to instrumental, confounding, and adjustment latent factors. To this end, we introduce a reconstruction objective and create an embedding space for irrelevant variables using an attached autoencoder. Instead of relying on serendipitous suppression of irrelevant variables as in previous deep disentanglement approaches, we explicitly force irrelevant variables into this embedding space and employ orthogonalization to prevent irrelevant information from leaking into the latent space representations of the other factors. Our experiments with synthetic and real-world benchmark datasets show that we can better identify irrelevant variables and more precisely predict treatment effects than previous methods, while prediction quality degrades less when additional irrelevant variables are introduced.

* Paper is accepted at ECAI-2024

Learning Solutions of Stochastic Optimization Problems with Bayesian Neural Networks

Jun 05, 2024

Alan A. Lahoud, Erik Schaffernicht, Johannes A. Stork

Abstract: Mathematical solvers use parametrized Optimization Problems (OPs) as inputs to yield optimal decisions. In many real-world settings, some of these parameters are unknown or uncertain. Recent research focuses on predicting the value of these unknown parameters using available contextual features, aiming to decrease decision regret by adopting end-to-end learning approaches. However, these approaches disregard prediction uncertainty and therefore make the mathematical solver susceptible to providing erroneous decisions in case of low-confidence predictions. We propose a novel framework that models prediction uncertainty with Bayesian Neural Networks (BNNs) and propagates this uncertainty into the mathematical solver with a Stochastic Programming technique. The differentiable nature of BNNs and differentiable mathematical solvers allows for two different learning approaches: in the Decoupled learning approach, we update the BNN weights to increase the quality of the predicted distribution of the OP parameters, while in the Combined learning approach, we update the weights aiming to directly minimize the expected OP cost function in a stochastic end-to-end fashion. We conduct an extensive evaluation using synthetic data with various noise properties and a real dataset, showing that decision regret is generally lower (better) with both proposed methods.

DataSP: A Differential All-to-All Shortest Path Algorithm for Learning Costs and Predicting Paths with Context

May 08, 2024

Alan A. Lahoud, Erik Schaffernicht, Johannes A. Stork

Abstract: Learning latent costs of transitions on graphs from trajectory demonstrations under various contextual features is challenging but useful for path planning. Yet, existing methods either oversimplify cost assumptions or scale poorly with the number of observed trajectories. This paper introduces DataSP, a differentiable all-to-all shortest path algorithm to facilitate learning latent costs from trajectories. It allows learning from a large number of trajectories in each learning step without additional computation. Complex latent cost functions from contextual features can be represented in the algorithm through a neural network approximation. We further propose a method to sample paths from DataSP in order to reconstruct or mimic the distributions of observed paths. We prove that the inferred distribution follows the maximum entropy principle. We show that DataSP outperforms state-of-the-art differentiable combinatorial solvers and classical machine learning approaches in predicting paths on graphs.
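
To illustrate what a differentiable all-to-all shortest path computation can look like, here is a hedged sketch that replaces the hard `min` in a Floyd-Warshall-style recursion with a smooth soft-minimum. This is a generic relaxation chosen for illustration, not a reproduction of DataSP; the names `softmin` and `smooth_all_pairs` and the temperature parameter `beta` are assumptions.

```python
import math

def softmin(a, b, beta):
    """Smooth lower bound on min(a, b); approaches min(a, b) as beta grows."""
    m = min(a, b)
    if math.isinf(m):
        return m  # both paths are unreachable
    # Shift by m for numerical stability (log-sum-exp trick).
    return m - math.log(math.exp(-beta * (a - m)) + math.exp(-beta * (b - m))) / beta

def smooth_all_pairs(cost, beta=50.0):
    """Floyd-Warshall-style recursion with softmin in place of min."""
    n = len(cost)
    dist = [row[:] for row in cost]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Skip updates that are no-ops in the exact algorithm, so the
                # smoothing does not push diagonal entries below zero.
                if i == j or k == i or k == j:
                    continue
                dist[i][j] = softmin(dist[i][j], dist[i][k] + dist[k][j], beta)
    return dist

INF = math.inf
# Small directed graph: 0 -> 1 costs 1, 1 -> 2 costs 2, 0 -> 2 costs 5.
cost = [
    [0.0, 1.0, 5.0],
    [INF, 0.0, 2.0],
    [INF, INF, 0.0],
]
sharp = smooth_all_pairs(cost, beta=50.0)  # close to the exact distances
soft = smooth_all_pairs(cost, beta=1.0)    # noticeably smoothed
```

Because every operation is smooth in the edge costs, gradients of the resulting distances with respect to `cost` are well defined, which is the property that makes such a recursion usable inside a learning loop when ported to an automatic differentiation framework.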

LaCE-LHMP: Airflow Modelling-Inspired Long-Term Human Motion Prediction By Enhancing Laminar Characteristics in Human Flow

Mar 20, 2024

Yufei Zhu, Han Fan, Andrey Rudenko, Martin Magnusson, Erik Schaffernicht, Achim J. Lilienthal

Abstract: Long-term human motion prediction (LHMP) is essential for safely operating autonomous robots and vehicles in populated environments. It is fundamental for various applications, including motion planning, tracking, human-robot interaction and safety monitoring. However, accurate prediction of human trajectories is challenging due to complex factors such as social norms and environmental conditions. The influence of such factors can be captured through Maps of Dynamics (MoDs), which encode spatial motion patterns learned from (possibly scattered and partial) past observations of motion in the environment and which can be used for data-efficient, interpretable motion prediction (MoD-LHMP). To address the limitations of prior work, especially regarding accuracy and sensitivity to anomalies in long-term prediction, we propose the Laminar Component Enhanced LHMP approach (LaCE-LHMP). Our approach is inspired by data-driven airflow modelling, which estimates laminar and turbulent flow components and uses predominantly the laminar components to make flow predictions. Based on the hypothesis that human trajectory patterns also manifest laminar flow (that represents predictable motion) and turbulent flow components (that reflect more unpredictable and arbitrary motion), LaCE-LHMP extracts the laminar patterns in human dynamics and uses them for human motion prediction. We demonstrate the superior prediction performance of LaCE-LHMP through benchmark comparisons with state-of-the-art LHMP methods, offering an unconventional perspective and a more intuitive understanding of human movement patterns.

* Accepted to the 2024 IEEE International Conference on Robotics and Automation (ICRA)

Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning

Oct 03, 2023

Finn Rietz, Stefan Heinrich, Erik Schaffernicht, Johannes Andreas Stork

Abstract: Reinforcement learning (RL) for complex tasks remains a challenge, primarily due to the difficulties of engineering scalar reward functions and the inherent inefficiency of training models from scratch. Instead, it would be better to specify complex tasks in terms of elementary subtasks and to reuse subtask solutions whenever possible. In this work, we address continuous space lexicographic multi-objective RL problems, consisting of prioritized subtasks, which are notoriously difficult to solve. We show that these can be scalarized with a subtask transformation and then solved incrementally using value decomposition. Exploiting this insight, we propose prioritized soft Q-decomposition (PSQD), a novel algorithm for learning and adapting subtask solutions under lexicographic priorities in continuous state-action spaces. PSQD offers the ability to reuse previously learned subtask solutions in a zero-shot composition, followed by an adaptation step. Its ability to use retained subtask training data for offline learning eliminates the need for new environment interaction during adaptation. We demonstrate the efficacy of our approach by presenting successful learning, reuse, and adaptation results for both low- and high-dimensional simulated robot control tasks, as well as offline learning results. In contrast to baseline approaches, PSQD does not trade off between conflicting subtasks or priority constraints and satisfies subtask priorities during learning. PSQD provides an intuitive framework for tackling complex RL problems, offering insights into the inner workings of the subtask composition.

Heterogeneous Full-body Control of a Mobile Manipulator with Behavior Trees

Oct 16, 2022

Marco Iannotta, David Cáceres Domínguez, Johannes A. Stork, Erik Schaffernicht, Todor Stoyanov

Abstract: Integrating the heterogeneous controllers of a complex mechanical system, such as a mobile manipulator, within the same structure and in a modular way is still challenging. In this work, we extend our framework based on Behavior Trees for the control of a redundant mechanical system to the problem of commanding more complex systems that involve multiple low-level controllers. This allows the integrated systems to achieve non-trivial goals that require coordination among the sub-systems.

* arXiv admin note: substantial text overlap with arXiv:2209.08619

Towards Task-Prioritized Policy Composition

Sep 20, 2022

Finn Rietz, Erik Schaffernicht, Todor Stoyanov, Johannes A. Stork

Abstract: Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space control, where low-priority control actions are projected into the null-space of high-priority control actions. Such a method is currently unavailable for Reinforcement Learning. We propose a novel, task-prioritized composition framework for Reinforcement Learning, which involves a novel concept: the indifference-space of Reinforcement Learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for Reinforcement Learning agents. Further, our approach can ensure high-priority constraint satisfaction, which makes it promising for learning in safety-critical domains like robotics. Unlike null-space control, our approach allows learning globally optimal policies for the compound task by online learning in the indifference-space of higher-level policies after initial compound policy construction.
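
The null-space control idea mentioned in the abstract can be sketched with the textbook projector N = I - J⁺J for a single high-priority task. The example below assumes a scalar task with a 1 x n Jacobian row so that the pseudoinverse has a closed form; the function names and the numbers are illustrative and do not come from the paper, which proposes a different, learning-based composition.

```python
def nullspace_project(jac_row, command):
    """Remove from `command` the component that would disturb the task J * q_dot."""
    norm_sq = sum(j * j for j in jac_row)
    if norm_sq == 0.0:
        return list(command)  # the task constrains nothing
    scale = sum(j * u for j, u in zip(jac_row, command)) / norm_sq
    return [u - scale * j for j, u in zip(jac_row, command)]

def compose(jac_row, task_rate, low_priority):
    """q_dot = J^+ * task_rate + (I - J^+ J) * low_priority for a 1 x n Jacobian."""
    norm_sq = sum(j * j for j in jac_row)
    if norm_sq == 0.0:
        raise ValueError("Jacobian row must be non-zero")
    primary = [j * task_rate / norm_sq for j in jac_row]      # minimum-norm solution
    secondary = nullspace_project(jac_row, low_priority)      # cannot affect the task
    return [p + s for p, s in zip(primary, secondary)]

jac_row = [1.0, 1.0, 0.0]        # hypothetical task Jacobian
low_priority = [1.0, -3.0, 0.5]  # hypothetical secondary joint velocities
q_dot = compose(jac_row, task_rate=2.0, low_priority=low_priority)
achieved = sum(j * q for j, q in zip(jac_row, q_dot))
```

Here `achieved` equals the requested high-priority task rate regardless of the low-priority command, which is the strict priority guarantee that null-space control provides.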

A Stack-of-Tasks Approach Combined with Behavior Trees: a New Framework for Robot Control

Sep 18, 2022

David Cáceres Domínguez, Marco Iannotta, Johannes A. Stork, Erik Schaffernicht, Todor Stoyanov

Abstract: Stack-of-Tasks (SoT) control allows a robot to simultaneously fulfill a number of prioritized goals formulated in terms of (in)equality constraints in error space. Since this approach solves a sequence of Quadratic Programs (QP) at each time-step, without taking into account any temporal state evolution, it is suitable for dealing with local disturbances. However, its limitation lies in the handling of situations that require non-quadratic objectives to achieve a specific goal, as well as situations where countering the control disturbance would require a locally suboptimal action. Recent works address this shortcoming by exploiting Finite State Machines (FSMs) to compose the tasks in such a way that the robot does not get stuck in local minima. Nevertheless, the intrinsic trade-off between reactivity and modularity that characterizes FSMs makes them impractical for defining reactive behaviors in dynamic environments. In this letter, we combine the SoT control strategy with Behavior Trees (BTs), a task switching structure that addresses some of the limitations of FSMs in terms of reactivity, modularity and re-usability. Experimental results on a Franka Emika Panda 7-DoF manipulator show the robustness of our framework, which allows the robot to benefit from the reactivity of both SoT and BTs.
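
As a rough illustration of why BTs are reactive, the sketch below implements the two standard composite nodes (Sequence and Fallback) and re-ticks the tree from the root at every step. The leaf behaviors (`is_holding`, `approach`, `grasp`) and the state dictionary are hypothetical placeholders; in the framework described above, leaves would instead activate SoT controllers.

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Leaf:
    """Wraps a function state -> status as a condition or action node."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self, state):
        return self.fn(state)

class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for child in self.children:
            status = child.tick(state)
            if status != SUCCESS:
                return status
        return SUCCESS

class Fallback:
    """Ticks children in order; stops at the first child that is not FAILURE."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for child in self.children:
            status = child.tick(state)
            if status != FAILURE:
                return status
        return FAILURE

def is_holding(state):
    return SUCCESS if state["holding"] else FAILURE

def approach(state):
    if state["dist"] == 0:
        return SUCCESS
    state["dist"] -= 1  # move one step closer per tick
    return RUNNING

def grasp(state):
    state["holding"] = True
    return SUCCESS

tree = Fallback(Leaf(is_holding), Sequence(Leaf(approach), Leaf(grasp)))
state = {"holding": False, "dist": 2}
# Every tick starts at the root, so conditions are re-checked each step.
history = [tree.tick(state) for _ in range(4)]
```

Because the root is ticked anew each cycle, a change in the world (for example, the object being dropped) would be noticed on the next tick without any explicit transition having to be declared, which is the modularity advantage over FSMs that the abstract refers to.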
