Abstract:Robotic manipulation requires accurate motion and physical interaction control. However, current robot learning approaches focus on motion-centric action spaces that do not explicitly give the policy control over the interaction. In this paper, we discuss the repercussions of this choice and argue for more interaction-explicit action spaces in robot learning.
Abstract:We study the choice of action space in robot manipulation learning and sim-to-real transfer. We define metrics that assess performance and examine the emerging properties of the different action spaces. We train over 250 reinforcement learning (RL) agents in simulated reaching and pushing tasks, using 13 different control spaces. The action spaces span popular choices in the literature as well as novel combinations of common design characteristics. We evaluate the training performance in simulation and the transfer to a real-world environment. We identify good and bad characteristics of robotic action spaces and make recommendations for future designs. Our findings have important implications for the design of RL algorithms for robot manipulation tasks, and highlight the need for careful consideration of action spaces when training and transferring RL agents for real-world robotics.
Abstract:Versatile movement representations allow robots to learn new tasks and rapidly adapt them to environmental changes, e.g. introduction of obstacles, placing additional robots in the workspace, modification of the joint range due to faults, or range-of-motion constraints due to tool manipulation. Probabilistic movement primitives (ProMPs) model robot movements as a distribution over trajectories, and they are an important tool due to their analytical tractability and ability to learn and generalise from a small number of demonstrations. Current approaches solve specific adaptation problems, e.g. obstacle avoidance; however, a generic probabilistic approach to adaptation has not yet been developed. In this paper we propose a generic probabilistic framework for adapting ProMPs. We formulate adaptation as a constrained optimisation problem where we minimise the Kullback-Leibler divergence between the adapted distribution and the distribution of the original primitive, and we constrain the probability mass associated with undesired trajectories to be low. We derive several types of constraints that can be added depending on the task, such as joint limiting, various types of obstacle avoidance, via-points, and mutual avoidance, under a common framework. We demonstrate our approach on several adaptation problems on simulated planar robot arms and 7-DOF Franka-Emika robots in single and dual robot arm settings.
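The constrained optimisation described in the abstract above can be sketched schematically as follows. All notation here is assumed for illustration and is not taken from the paper: $p_{0}$ denotes the trajectory distribution of the original primitive, $q$ the adapted distribution, $\mathcal{U}_{i}$ the $i$-th set of undesired trajectories (e.g. those violating a joint limit or colliding with an obstacle), and $\epsilon_{i}$ a small tolerance on its probability mass. The direction of the divergence is also an assumption, since the abstract does not state it.

```latex
% Schematic form only; symbols and the KL direction are assumptions.
\[
\begin{aligned}
q^{*} = \arg\min_{q}\quad & \mathrm{KL}\!\left( q(\tau) \,\middle\|\, p_{0}(\tau) \right) \\
\text{s.t.}\quad & \Pr_{\tau \sim q}\!\left[ \tau \in \mathcal{U}_{i} \right] \le \epsilon_{i},
\qquad i = 1, \dots, K .
\end{aligned}
\]
```

Under this reading, each task-specific requirement (joint limits, obstacle avoidance, via-points, mutual avoidance) would contribute one or more sets $\mathcal{U}_{i}$, while the objective keeps the adapted primitive close to the original one.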