Abstract: In complex real-world tasks such as robotic manipulation and autonomous driving, collecting expert demonstrations is often more straightforward than specifying precise learning objectives and task descriptions. Learning from expert data can be achieved through behavioral cloning or by learning a reward function, i.e., inverse reinforcement learning. The latter allows for training with additional data outside the training distribution, guided by the inferred reward function. We propose a novel approach to construct compact and transparent reward models from automatically selected state features. These inferred rewards have an explicit form and enable the learning of policies that closely match expert behavior by training standard reinforcement learning algorithms from scratch. We validate our method's performance in various robotic environments with continuous and high-dimensional state spaces. Webpage: \url{https://sites.google.com/view/transparent-reward}.
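A minimal sketch of what such a transparent reward could look like, assuming the explicit form is a sparse linear combination of selected state features (the feature-selection procedure itself is not specified in the abstract; all names below are illustrative):

import numpy as np

def transparent_reward(state_features: np.ndarray, weights: np.ndarray) -> float:
    # Reward as an explicit, human-readable linear combination of the
    # automatically selected state features (zero weight = feature not selected).
    return float(weights @ state_features)

# Example: only two of four candidate features carry non-zero weight,
# keeping the reward model compact and easy to inspect.
weights = np.array([0.0, 1.5, 0.0, -0.8])
state = np.array([0.2, 0.9, -0.3, 0.5])
print(transparent_reward(state, weights))  # 1.5*0.9 - 0.8*0.5 = 0.95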
Abstract: Conditional Neural Processes (CNPs) are a class of meta-learning models popular for combining the runtime efficiency of amortized inference with reliable uncertainty quantification. Many relevant machine learning tasks, such as spatio-temporal modeling, Bayesian optimization, and continuous control, contain equivariances -- for example to translation -- which the model can exploit for maximal performance. However, prior attempts to include equivariances in CNPs do not scale effectively beyond two input dimensions. In this work, we propose Relational Conditional Neural Processes (RCNPs), an effective approach to incorporate equivariances into any neural process model. Our proposed method extends the applicability and impact of equivariant neural processes to higher dimensions. We empirically demonstrate the competitive performance of RCNPs on a large array of tasks naturally containing equivariances.
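A minimal sketch of one way a relational, translation-equivariant encoding can be built, assuming context and target inputs are compared through their differences (illustrative only; the full RCNP architecture is not specified in the abstract):

import torch

def relational_encoding(x_ctx, y_ctx, x_tgt):
    # Represent each (target, context) pair by the relative input difference
    # plus the context output; shifting all inputs by the same offset leaves
    # the representation unchanged, giving translation equivariance.
    diff = x_ctx[None, :, :] - x_tgt[:, None, :]          # (M, N, D) relative positions
    y = y_ctx[None, :, :].expand(diff.shape[0], -1, -1)   # (M, N, Dy) context outputs
    return torch.cat([diff, y], dim=-1)

x_ctx, y_ctx, x_tgt = torch.randn(5, 3), torch.randn(5, 1), torch.randn(2, 3)
shift = torch.randn(3)
assert torch.allclose(relational_encoding(x_ctx, y_ctx, x_tgt),
                      relational_encoding(x_ctx + shift, y_ctx, x_tgt + shift))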
Abstract: Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to limited amounts of training data or limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world. Disentangled representations can improve robustness, but existing disentanglement techniques that minimise mutual information between features require independent features and thus cannot disentangle correlated features. We propose an auxiliary task for RL algorithms that learns a disentangled representation of high-dimensional observations with correlated features by minimising the conditional mutual information between features in the representation. We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as the training performance of RL algorithms in the presence of correlated features.
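For intuition, the quantity being minimised can be made concrete under a joint-Gaussian assumption, where the conditional mutual information between two features given a shared factor reduces to their partial correlation (the paper's actual estimator and conditioning variables are not specified in the abstract; this is illustrative only):

import numpy as np

def gaussian_cmi(z1, z2, c):
    # I(z1; z2 | c) for jointly Gaussian variables, computed via the partial
    # correlation of z1 and z2 after regressing out c.
    def residual(x, cond):
        A = np.column_stack([cond, np.ones_like(cond)])
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        return x - A @ coef
    r1, r2 = residual(z1, c), residual(z2, c)
    rho = np.corrcoef(r1, r2)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)  # in nats

# Two features driven by a shared factor are strongly correlated, yet their
# conditional mutual information given that factor is near zero -- the kind of
# correlated-but-disentangled representation the auxiliary task aims for.
rng = np.random.default_rng(0)
c = rng.normal(size=10_000)
z1 = c + 0.1 * rng.normal(size=c.size)
z2 = c + 0.1 * rng.normal(size=c.size)
print(gaussian_cmi(z1, z2, c))  # close to 0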
Abstract: The co-adaptation of robots has been a long-standing research endeavour with the goal of adapting both the body and behaviour of a system for a given task, inspired by the natural evolution of animals. Co-adaptation has the potential to eliminate costly manual hardware engineering as well as improve the performance of systems. The standard approach to co-adaptation is to use a reward function for optimizing behaviour and morphology. However, defining and constructing such reward functions is notoriously difficult and often a significant engineering effort. This paper introduces a new viewpoint on the co-adaptation problem, which we call co-imitation: finding a morphology and a policy that allow an imitator to closely match the behaviour of a demonstrator. To this end, we propose a co-imitation methodology for adapting behaviour and morphology by matching the state distributions of the demonstrator. Specifically, we focus on the challenging scenario of mismatched state and action spaces between the two agents. We find that co-imitation increases behaviour similarity across a variety of tasks and settings, and we demonstrate co-imitation by transferring human walking, jogging and kicking skills onto a simulated humanoid.
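A minimal sketch of a state-distribution-matching reward, assuming the imitator is rewarded for visiting states that are likely under the demonstrator's state distribution (the density model and file name below are assumptions for illustration, not the paper's method):

import numpy as np
from sklearn.neighbors import KernelDensity

demo_states = np.load("demonstrator_states.npy")   # (N, D) demonstrator states; hypothetical file
density = KernelDensity(bandwidth=0.2).fit(demo_states)

def imitation_reward(state: np.ndarray) -> float:
    # Log-density of the imitator's state under the demonstrator's state
    # distribution; only states are compared, so mismatched action spaces are fine.
    return float(density.score_samples(state.reshape(1, -1))[0])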
Abstract: The co-adaptation of robot morphology and behaviour becomes increasingly important with the advent of fast 3D-manufacturing methods and efficient deep reinforcement learning algorithms. A major challenge for the application of co-adaptation methods to the real world is the simulation-to-reality gap due to model and simulation inaccuracies. However, prior work focuses primarily on the study of evolutionary adaptation of morphologies exploiting analytical models and (differentiable) simulators with large population sizes, neglecting the existence of the simulation-to-reality gap and the cost of manufacturing cycles in the real world. This paper presents a new approach combining classic high-frequency deep neural networks with computationally expensive Graph Neural Networks for the data-efficient co-adaptation of agents with varying numbers of degrees of freedom. Evaluations in simulation show that the new method can co-adapt agents within a limited number of production cycles by efficiently combining design optimization with offline reinforcement learning, allowing for direct application to real-world co-adaptation tasks in future work.
Abstract: Contacts and friction are inherent to nearly all robotic manipulation tasks. Through the motor skill of insertion, we study how robots can learn to cope when these attributes play a salient role. In this work we propose residual learning from demonstration (rLfD), a framework that combines dynamic movement primitives (DMPs), which rely on behavioural cloning, with a reinforcement learning (RL) based residual correction policy. The proposed solution is applied directly in task space and operates on the full pose of the robot. We show that rLfD outperforms alternatives and improves the generalisation abilities of DMPs. We evaluate this approach by training an agent to successfully perform both simulated and real-world insertions of pegs, gears and plugs into their respective sockets.
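A minimal sketch of the residual composition, assuming the RL correction is simply added to the DMP's task-space output (names are illustrative; correcting full poses would compose orientations properly rather than adding them):

import numpy as np

def rlfd_action(dmp_action, residual_policy, observation):
    # The behaviourally cloned DMP proposes a task-space action; the RL
    # policy contributes a small learned correction on top of it.
    return dmp_action + residual_policy(observation)

# Toy usage: a 3-D positional target with a zero-correction residual policy.
base = np.array([0.40, 0.00, 0.20])
print(rlfd_action(base, lambda obs: np.zeros(3), observation=None))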
Abstract: Model-free reinforcement learning algorithms such as Deep Deterministic Policy Gradient (DDPG) often require additional exploration strategies, especially if the actor is deterministic. This work evaluates the use of model-based trajectory optimization methods for exploration in Deep Deterministic Policy Gradient when trained on a latent image embedding. In addition, an extension of DDPG is derived that uses a value function as critic, making use of a learned deep dynamics model to compute the policy gradient. This approach leads to a symbiotic relationship between the deep reinforcement learning algorithm and the latent trajectory optimizer: the trajectory optimizer benefits from the critic learned by the RL algorithm, and the latter benefits from the enhanced exploration generated by the planner. The developed methods are evaluated on two continuous control tasks, one in simulation and one in the real world. In particular, a Baxter robot is trained to perform an insertion task while receiving only sparse rewards and images as observations from the environment.
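A minimal sketch of the described policy-gradient extension, assuming a learned dynamics model f(s, a) and a state-value critic V(s'): the deterministic policy is improved by differentiating V(f(s, pi(s))), the model-based analogue of DDPG's Q(s, pi(s)) objective (all module definitions below are illustrative, not the paper's code):

import torch
import torch.nn as nn

state_dim, action_dim = 8, 2
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, action_dim))
dynamics = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.Tanh(),
                         nn.Linear(64, state_dim))            # learned model f(s, a)
value = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, 1))

states = torch.randn(32, state_dim)                            # e.g. latent image embeddings
actions = policy(states)
next_states = dynamics(torch.cat([states, actions], dim=-1))   # predicted s' = f(s, a)
loss = -value(next_states).mean()                              # ascend the predicted value
loss.backward()                                                # gradient flows through f into the policy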
Abstract: Humans and animals are capable of quickly learning new behaviours to solve new tasks. Yet, we often forget that they also rely on a highly specialized morphology that co-adapted with motor control over thousands of years. Although compelling, the idea of co-adapting morphology and behaviours in robots is often infeasible because of the long manufacturing times and the need to re-design an appropriate controller for each morphology. In this paper, we propose a novel approach to automatically and efficiently co-adapt a robot morphology and its controller. Our approach is based on recent advances in deep reinforcement learning, specifically the soft actor-critic algorithm. Key to our approach is the possibility of leveraging previously tested morphologies and behaviours to estimate the performance of new candidate morphologies. As such, we can make full use of the available information to make more informed decisions, with the ultimate goal of achieving a more data-efficient co-adaptation (i.e., reducing the number of morphologies and behaviours tested). Simulated experiments show that our approach requires drastically fewer design prototypes to find good morphology-behaviour combinations, making this method particularly suitable for future co-adaptation of robot designs in the real world.
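A minimal sketch of how previously gathered experience could score new candidate morphologies without building them, assuming the critic and policy are conditioned on design parameters (all names and shapes below are assumptions, not the paper's code):

import torch
import torch.nn as nn

state_dim, action_dim, design_dim = 6, 2, 3
policy = nn.Sequential(nn.Linear(state_dim + design_dim, 32), nn.Tanh(), nn.Linear(32, action_dim))
q_fn = nn.Sequential(nn.Linear(state_dim + action_dim + design_dim, 32), nn.Tanh(), nn.Linear(32, 1))

def score_design(xi, initial_states):
    # Estimate a candidate design's return by querying the design-conditioned
    # critic on sampled initial states, without manufacturing or testing it.
    xi_b = xi.expand(initial_states.shape[0], -1)
    a = policy(torch.cat([initial_states, xi_b], dim=-1))
    return q_fn(torch.cat([initial_states, a, xi_b], dim=-1)).mean().item()

s0 = torch.randn(16, state_dim)                                 # sampled initial states
candidates = [torch.randn(1, design_dim) for _ in range(8)]
best = max(candidates, key=lambda xi: score_design(xi, s0))     # most promising design to test next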
Abstract: We present a methodology for fast prototyping of morphologies and controllers for robot locomotion. Going beyond simulation-based approaches, we argue that the form and function of a robot, as well as their interplay with real-world environmental conditions, are critical. Hence, fast design and learning cycles are necessary to adapt a robot's shape and behavior to its environment. To this end, we present a combination of laminate robot manufacturing and sample-efficient reinforcement learning. We leverage this methodology to conduct an extensive robot learning experiment. Inspired by locomotion in sea turtles, we design a low-cost crawling robot with variable, interchangeable fins. Learning is performed using both bio-inspired and original fin designs in an artificial indoor environment as well as a natural environment in the Arizona desert. The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments. In contrast, sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.