Adrian Röfer

The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning

May 06, 2025

Jan Ole von Hartz, Adrian Röfer, Joschka Boedecker, Abhinav Valada

Abstract: We present Mixture of Discrete-time Gaussian Processes (MiDiGap), a novel approach for flexible policy representation and imitation learning in robot manipulation. MiDiGap enables learning from as few as five demonstrations using only camera observations and generalizes across a wide range of challenging tasks. It excels at long-horizon behaviors such as making coffee, highly constrained motions such as opening doors, dynamic actions such as scooping with a spatula, and multimodal tasks such as hanging a mug. MiDiGap learns these tasks on a CPU in less than a minute and scales linearly to large datasets. We also develop a rich suite of tools for inference-time steering using evidence such as collision signals and robot kinematic constraints. This steering enables novel generalization capabilities, including obstacle avoidance and cross-embodiment policy transfer. MiDiGap achieves state-of-the-art performance on diverse few-shot manipulation benchmarks. On constrained RLBench tasks, it improves policy success by 76 percentage points and reduces trajectory cost by 67%. On multimodal tasks, it improves policy success by 48 percentage points and increases sample efficiency by a factor of 20. In cross-embodiment transfer, it more than doubles policy success. We make the code publicly available at https://midigap.cs.uni-freiburg.de.

* Submitted for publication to IEEE Transactions on Robotics

Learning Few-Shot Object Placement with Intra-Category Transfer

Nov 05, 2024

Adrian Röfer, Russell Buchanan, Max Argus, Sethu Vijayakumar, Abhinav Valada

Abstract: Efficient learning from demonstration for long-horizon tasks remains an open challenge in robotics. While significant effort has been directed toward learning trajectories, a recent resurgence of object-centric approaches has demonstrated improved sample efficiency, enabling transferable robotic skills. Such approaches model tasks as a sequence of object poses over time. In this work, we propose a scheme for transferring observed object arrangements to novel object instances by learning these arrangements on canonical class frames. We then employ this scheme to enable a simple yet effective approach for training models from as few as five demonstrations to predict arrangements of a wide range of objects including tableware, cutlery, furniture, and desk spaces. We propose a method for optimizing the learned models that enables efficient learning of tasks such as setting a table or tidying up an office with intra-category transfer, even in the presence of distractors. We present extensive experimental results in simulation and on a real robotic system for table setting which, based on human evaluations, scored 73.3% compared to a human baseline. We make the code and trained models publicly available at http://oplict.cs.uni-freiburg.de.

* 8 pages, 7 figures, 2 tables, submitted to RA-L

Imagine2touch: Predictive Tactile Sensing for Robotic Manipulation using Efficient Low-Dimensional Signals

May 02, 2024

Abdallah Ayad, Adrian Röfer, Nick Heppert, Abhinav Valada

Abstract: Humans seemingly incorporate potential touch signals in their perception. Our goal is to equip robots with a similar capability, which we term Imagine2touch. Imagine2touch aims to predict the expected touch signal based on a visual patch representing the area to be touched. We use ReSkin, an inexpensive and compact touch sensor, to collect the required dataset through random touching of five basic geometric shapes and one tool. We train Imagine2touch on two of those shapes and validate it on the out-of-distribution (OOD) tool. We demonstrate the efficacy of Imagine2touch through its application to the downstream task of object recognition. In this task, we evaluate Imagine2touch's performance in two experiments, together comprising five objects outside the training distribution. Imagine2touch achieves an object recognition accuracy of 58% after ten touches per object, surpassing a proprioception baseline.

* 3 pages, 3 figures, 2 tables, accepted at ViTac2024 ICRA2024 Workshop. arXiv admin note: substantial text overlap with arXiv:2403.15107

PseudoTouch: Efficiently Imaging the Surface Feel of Objects for Robotic Manipulation

Mar 22, 2024

Adrian Röfer, Nick Heppert, Abdallah Ayman, Eugenio Chisari, Abhinav Valada

Abstract: Humans seemingly incorporate potential touch signals in their perception. Our goal is to equip robots with a similar capability, which we term PseudoTouch. PseudoTouch aims to predict the expected touch signal based on a visual patch representing the touched area. We frame this problem as the task of learning a low-dimensional visual-tactile embedding, wherein we encode a depth patch from which we decode the tactile signal. To accomplish this task, we employ ReSkin, an inexpensive and replaceable magnetic-based tactile sensor. Using ReSkin, we collect and train PseudoTouch on a dataset comprising aligned tactile and visual data pairs obtained through random touching of eight basic geometric shapes. We demonstrate the efficacy of PseudoTouch through its application to two downstream tasks: object recognition and grasp stability prediction. In the object recognition task, we evaluate the learned embedding's performance on a set of five basic geometric shapes and five household objects. Using PseudoTouch, we achieve an object recognition accuracy of 84% after just ten touches, surpassing a proprioception baseline. For the grasp stability task, we use ACRONYM labels to train and evaluate a grasp success predictor using PseudoTouch's predictions derived from virtual depth information. Our approach yields an impressive 32% absolute improvement in accuracy compared to the baseline relying on partial point cloud data. We make the data, code, and trained models publicly available at http://pseudotouch.cs.uni-freiburg.de.

* 8 pages, 7 figures, 2 tables, submitted to IROS2024

Bayesian Optimization for Sample-Efficient Policy Improvement in Robotic Manipulation

Mar 21, 2024

Adrian Röfer, Iman Nematollahi, Tim Welschehold, Wolfram Burgard, Abhinav Valada

Abstract: Sample-efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the type of task that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. Especially with regard to learning actions on robots in the real world, this poses a major problem due to the high costs associated with both demonstrations and real-world robot interactions. To address this challenge, we introduce BOpt-GMM, a hybrid approach that combines imitation learning with the robot's own experience collection. We first learn a skill model as a dynamical system encoded in a Gaussian Mixture Model from a few demonstrations. We then improve this model with Bayesian optimization building on a small number of autonomous skill executions in a sparse reward setting. We demonstrate the sample efficiency of our approach on multiple complex manipulation skills in both simulations and real-world experiments. Furthermore, we make the code and pre-trained models publicly available at http://bopt-gmm.cs.uni-freiburg.de.

* 7 pages, 5 figures, 2 tables, submitted to IROS2024
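
Illustrative note: the first step named in the abstract, a skill encoded as a dynamical system in a Gaussian Mixture Model, can be pictured with Gaussian mixture regression, where the commanded velocity is the expected velocity conditioned on the current position. The sketch below is a minimal 1-D version under stated assumptions: the component values, the names `COMPONENTS`, `gmr_velocity`, and `rollout`, and the Euler integration are all hypothetical and not taken from the paper or its code. The Bayesian optimization step is not shown.

```python
import math

# Hypothetical 1-D skill model: each component is a joint Gaussian over
# position x and velocity v. Values are illustrative, not from the paper.
# The velocity variance is omitted because conditioning on x does not need it.
COMPONENTS = [
    {"weight": 0.5, "mean_x": 0.0, "mean_v": 1.0, "var_x": 0.1, "cov_xv": -0.1},
    {"weight": 0.5, "mean_x": 1.0, "mean_v": 0.0, "var_x": 0.1, "cov_xv": -0.2},
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_velocity(x, components):
    """Gaussian mixture regression: E[v | x] under the mixture."""
    # Responsibility of each component for the current position.
    resp = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components]
    total = sum(resp)
    velocity = 0.0
    for r, c in zip(resp, components):
        # Conditional mean of v given x for one joint Gaussian.
        cond_mean = c["mean_v"] + c["cov_xv"] / c["var_x"] * (x - c["mean_x"])
        velocity += (r / total) * cond_mean
    return velocity


def rollout(x0, components, dt=0.05, steps=200):
    """Integrate the learned velocity field with explicit Euler steps."""
    x = x0
    for _ in range(steps):
        x += dt * gmr_velocity(x, components)
    return x


if __name__ == "__main__":
    print(rollout(0.0, COMPONENTS))  # approaches the goal at x = 1
```

With these example values both components predict zero velocity at x = 1, so a rollout started at x = 0 settles at the goal. In a hybrid scheme of the kind the abstract describes, the mixture parameters would be the quantities adjusted after each autonomous execution.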

Online Estimation of Articulated Objects with Factor Graphs using Vision and Proprioceptive Sensing

Sep 28, 2023

Russell Buchanan, Adrian Röfer, João Moura, Abhinav Valada, Sethu Vijayakumar

Abstract: From dishwashers to cabinets, humans interact with articulated objects every day, and for a robot to assist in common manipulation tasks, it must learn a representation of articulation. Recent deep learning methods can provide powerful vision-based priors on the affordance of articulated objects from previous, possibly simulated, experiences. In contrast, many works estimate articulation by observing the object in motion, requiring the robot to already be interacting with the object. In this work, we propose to use the best of both worlds by introducing an online estimation method that merges vision-based affordance predictions from a neural network with interactive kinematic sensing in an analytical model. Our work has the benefit of using vision to predict an articulation model before touching the object, while also being able to update the model quickly from kinematic sensing during the interaction. In this paper, we implement a full system using shared autonomy for robotic opening of articulated objects, in particular objects in which the articulation is not apparent from vision alone. We implemented our system on a real robot and performed several autonomous closed-loop experiments in which the robot had to open a door with an unknown joint while estimating the articulation online. Our system achieved an 80% success rate for autonomous opening of unknown articulated objects.
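
Illustrative note: the idea of starting from a vision-based prediction and refining it with kinematic measurements during interaction can be shown with the simplest special case of a factor graph, a single scalar variable with one Gaussian prior factor and several Gaussian measurement factors. The sketch below is not the paper's system; the function name `fuse_online`, the scalar "hinge radius" parameter, and all numbers are assumptions chosen for illustration.

```python
def fuse_online(prior_mean, prior_var, measurements, meas_var):
    """Sequentially fuse Gaussian measurements into a Gaussian prior.

    Works in information form: precisions add, and the estimate is the
    precision-weighted mean. Returns the (mean, variance) after each
    measurement so the online refinement is visible.
    """
    precision = 1.0 / prior_var          # information from the vision prior
    information = prior_mean / prior_var
    history = []
    for z in measurements:
        precision += 1.0 / meas_var      # each measurement adds information
        information += z / meas_var
        history.append((information / precision, 1.0 / precision))
    return history


if __name__ == "__main__":
    # Hypothetical numbers: vision guesses a 0.50 m hinge radius with high
    # uncertainty; kinematic sensing during opening reports about 0.70 m.
    for mean, var in fuse_online(0.50, 0.04, [0.70, 0.72, 0.69], 0.01):
        print(f"radius = {mean:.3f} m, variance = {var:.5f}")
```

In this toy setting the estimate moves from the uncertain visual guess toward the kinematic evidence, and its variance shrinks with every measurement, which mirrors the division of labor the abstract describes.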

Doing Right by Not Doing Wrong in Human-Robot Collaboration

Feb 05, 2022

Laura Londoño, Adrian Röfer, Tim Welschehold, Abhinav Valada

Abstract: As robotic systems become more and more capable of assisting humans in their everyday lives, we must consider the opportunities for these artificial agents to make their human collaborators feel unsafe or to treat them unfairly. Robots can exhibit antisocial behavior causing physical harm to people or reproduce unfair behavior replicating and even amplifying historical and societal biases which are detrimental to humans they interact with. In this paper, we discuss these issues considering sociable robotic manipulation and fair robotic decision making. We propose a novel approach to learning fair and sociable behavior, not by reproducing positive behavior, but rather by avoiding negative behavior. In this study, we highlight the importance of incorporating sociability in robot manipulation, as well as the need to consider fairness in human-robot interactions.

Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models

Nov 25, 2021

Iman Nematollahi, Erick Rosete-Beas, Adrian Röfer, Tim Welschehold, Abhinav Valada, Wolfram Burgard

Abstract: A core challenge for an autonomous agent acting in the real world is to adapt its repertoire of skills to cope with its noisy perception and dynamics. To scale learning of skills to long-horizon tasks, robots should be able to learn and later refine their skills in a structured manner through trajectories rather than making instantaneous decisions individually at each time step. To this end, we propose the Soft Actor-Critic Gaussian Mixture Model (SAC-GMM), a novel hybrid approach that learns robot skills through a dynamical system and adapts the learned skills in their own trajectory distribution space through interactions with the environment. Our approach combines classical robotics techniques of learning from demonstration with the deep reinforcement learning framework and exploits their complementary nature. We show that our method utilizes sensors solely available during the execution of preliminarily learned skills to extract relevant features that lead to faster skill refinement. Extensive evaluations in both simulation and real-world environments demonstrate the effectiveness of our method in refining robot skills by leveraging physical interactions, high-dimensional sensory data, and sparse task completion rewards. Videos, code, and pre-trained models are available at http://sac-gmm.cs.uni-freiburg.de.

* Submitted to the 2022 IEEE International Conference on Robotics and Automation (ICRA)

Kineverse: A Symbolic Articulation Model Framework for Model-Generic Software for Mobile Manipulation

Dec 09, 2020

Adrian Röfer, Georg Bartels, Michael Beetz

Abstract: Human developers want to program robots using abstract instructions, such as "fetch the milk from the fridge". To translate such instructions into actionable plans, the robot's software requires in-depth background knowledge. With regard to interactions with articulated objects such as doors and drawers, the robot requires a model that it can use for state estimation and motion planning. Existing articulation model frameworks take a descriptive approach to model building, which requires additional background knowledge to construct mathematical models for computation. In this paper, we introduce the articulation model framework Kineverse which uses symbolic mathematical expressions to model articulated objects. We provide a theoretical description of this framework, and the operations that are supported by its models, and suggest a software architecture for integrating our framework in a robotics application. To demonstrate the applicability of our framework to robotics, we employ it in solving two common robotics problems from state estimation and manipulation.

* 7 pages, 7 figures, Submitted to IEEE International Conference on Robotics and Automation 2021 (ICRA 2021)

Semantic Linking Maps for Active Visual Object Search

Jun 18, 2020

Zhen Zeng, Adrian Röfer, Odest Chadwicke Jenkins

Abstract: We aim for mobile robots to function in a variety of common human environments. Such robots need to be able to reason about the locations of previously unseen target objects. Landmark objects can help this reasoning by narrowing down the search space significantly. More specifically, we can exploit background knowledge about common spatial relations between landmark and target objects. For example, seeing a table and knowing that cups can often be found on tables aids the discovery of a cup. Such correlations can be expressed as distributions over possible pairing relationships of objects. In this paper, we propose an active visual object search strategy through our introduction of the Semantic Linking Maps (SLiM) model. SLiM simultaneously maintains the belief over a target object's location as well as landmark objects' locations, while accounting for probabilistic inter-object spatial relations. Based on SLiM, we describe a hybrid search strategy that selects the next best view pose for searching for the target object based on the maintained belief. We demonstrate the efficiency of our SLiM-based search strategy through comparative experiments in simulated environments. We further demonstrate the real-world applicability of SLiM-based search in scenarios with a Fetch mobile manipulation robot.

* Published in ICRA 2020 (Best Paper Award in Cognitive Robotics)
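
Illustrative note: the table-and-cup example in the abstract can be pictured as a Bayesian update of a belief over the target's location. The sketch below is a toy discrete version and not the SLiM model itself: the grid of cells, the probabilities `p_related` and `p_detect`, the function names, and the greedy view selection are all assumptions made for illustration.

```python
def normalize(belief):
    """Rescale a list of non-negative weights so it sums to one."""
    total = sum(belief)
    return [b / total for b in belief]


def apply_landmark_relation(belief, landmark_cell, p_related=0.8):
    """Reweight the target belief given a landmark seen at landmark_cell.

    Assumes the target shares the landmark's cell with probability
    p_related and is spread uniformly over the other cells otherwise.
    """
    n = len(belief)
    likelihood = [
        p_related if cell == landmark_cell else (1.0 - p_related) / (n - 1)
        for cell in range(n)
    ]
    return normalize([b * l for b, l in zip(belief, likelihood)])


def apply_negative_observation(belief, viewed_cell, p_detect=0.9):
    """Update after looking at viewed_cell and not detecting the target."""
    updated = [
        b * (1.0 - p_detect) if cell == viewed_cell else b
        for cell, b in enumerate(belief)
    ]
    return normalize(updated)


def next_best_view(belief):
    """Greedy choice: look at the cell with the highest belief."""
    return max(range(len(belief)), key=lambda cell: belief[cell])


if __name__ == "__main__":
    belief = [0.2] * 5                      # uniform prior over five cells
    belief = apply_landmark_relation(belief, landmark_cell=3)
    print(next_best_view(belief), belief)   # the table's cell dominates
    belief = apply_negative_observation(belief, viewed_cell=3)
    print(belief)                           # mass shifts to the other cells
```

The first update concentrates belief around the landmark, and a failed look at that cell moves probability mass back to the remaining cells, which is the kind of belief maintenance a next-best-view search relies on.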
