Abstract:Close and precise placement of irregularly shaped objects requires a skilled robotic system. Particularly challenging is the manipulation of objects that have sensitive top surfaces and a fixed set of neighbors. To avoid damaging the surface, they have to be grasped from the side, and during placement, their neighbor relations have to be maintained. In this work, we train a reinforcement learning agent that generates smooth end-effector motions to place objects as close as possible next to each other. During the placement, our agent considers neighbor constraints defined in a given layout of the objects while trying to avoid collisions. Our approach learns to place compact object assemblies without the need for predefined spacing between objects as required by traditional methods. We thoroughly evaluated our approach using a two-finger gripper mounted to a robotic arm with six degrees of freedom. The results show that our agent outperforms two baseline approaches in terms of object assembly compactness, thereby reducing the needed space to place the objects according to the given neighbor constraints. On average, our approach reduces the distances between all placed objects by at least 60%, with fewer collisions at the same compactness compared to both baselines.
Abstract:Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches, utilizing demonstrations and user feedback for personalization. However, personal preferences are subject to change and might even be context-dependent. Yet traditional reinforcement learning (RL) approaches with a static reward function often fall short in adapting to these varying user preferences. This paper introduces a framework that combines multi-objective reinforcement learning (MORL) with demonstration-based learning. Our approach allows for dynamic adaptation to changing user preferences without retraining. Through rigorous evaluations, including sim-to-real and robot-to-robot transfers, we demonstrate our framework's capability to reflect user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuance.
Abstract:Mobile robots are increasingly being used in noisy environments for social purposes, e.g. to provide support in healthcare or public spaces. Since these robots also operate beyond human sight, the question arises as to how different robot types, ambient noise or cognitive engagement impacts the detection of the robots by their sound. To address this research gap, we conducted a user study measuring auditory detection distances for a wheeled (Turtlebot 2i) and quadruped robot (Unitree Go 1), which emit different consequential sounds when moving. Additionally, we also manipulated background noise levels and participants' engagement in a secondary task during the study. Our results showed that the quadruped robot sound was detected significantly better (i.e., at a larger distance) than the wheeled one, which demonstrates that the movement mechanism has a meaningful impact on the auditory detectability. The detectability for both robots diminished significantly as background noise increased. But even in high background noise, participants detected the quadruped robot at a significantly larger distance. The engagement in a secondary task had hardly any impact. In essence, these findings highlight the critical role of distinguishing auditory characteristics of different robots to improve the smooth human-centered navigation of mobile robots in noisy environments.
Abstract:To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.
Abstract:In 1997, the very first tour guide robot RHINO was deployed in a museum in Germany. With the ability to navigate autonomously through the environment, the robot gave tours to over 2,000 visitors. Today, RHINO itself has become an exhibit and is no longer operational. In this paper, we present RHINO-VR, an interactive museum exhibit using virtual reality (VR) that allows museum visitors to experience the historical robot RHINO in operation in a virtual museum. RHINO-VR, unlike static exhibits, enables users to familiarize themselves with basic mobile robotics concepts without the fear of damaging the exhibit. In the virtual environment, the user is able to interact with RHINO in VR by pointing to a location to which the robot should navigate and observing the corresponding actions of the robot. To include other visitors who cannot use the VR, we provide an external observation view to make RHINO visible to them. We evaluated our system by measuring the frame rate of the VR simulation, comparing the generated virtual 3D models with the originals, and conducting a user study. The user-study showed that RHINO-VR improved the visitors' understanding of the robot's functionality and that they would recommend experiencing the VR exhibit to others.
Abstract:Foresighted robot navigation in dynamic indoor environments with cost-efficient hardware necessitates the use of a lightweight yet dependable controller. So inferring the scene dynamics from sensor readings without explicit object tracking is a pivotal aspect of foresighted navigation among pedestrians. In this paper, we introduce a spatiotemporal attention pipeline for enhanced navigation based on 2D lidar sensor readings. This pipeline is complemented by a novel lidar-state representation that emphasizes dynamic obstacles over static ones. Subsequently, the attention mechanism enables selective scene perception across both space and time, resulting in improved overall navigation performance within dynamic scenarios. We thoroughly evaluated the approach in different scenarios and simulators, finding good generalization to unseen environments. The results demonstrate outstanding performance compared to state-of-the-art methods, thereby enabling the seamless deployment of the learned controller on a real robot.
Abstract:Collision-free, goal-directed navigation in environments containing unknown static and dynamic obstacles is still a great challenge, especially when manual tuning of navigation policies or costly motion prediction needs to be avoided. In this paper, we therefore propose a subgoal-driven hierarchical navigation architecture that is trained with deep reinforcement learning and decouples obstacle avoidance and motor control. In particular, we separate the navigation task into the prediction of the next subgoal position for avoiding collisions while moving toward the final target position, and the prediction of the robot's velocity controls. By relying on 2D lidar, our method learns to avoid obstacles while still achieving goal-directed behavior as well as to generate low-level velocity control commands to reach the subgoals. In our architecture, we apply the attention mechanism on the robot's 2D lidar readings and compute the importance of lidar scan segments for avoiding collisions. As we show in simulated and real-world experiments with a Turtlebot robot, our proposed method leads to smooth and safe trajectories among humans and significantly outperforms a state-of-the-art approach in terms of success rate. A supplemental video describing our approach is available online.
Abstract:When arranging objects with robotic arms, the quality of the end result strongly depends on the achievable placement accuracy. However, even the most advanced robotic systems are prone to positioning errors that can occur at different steps of the manipulation process. Ignoring such errors can lead to the partial or complete failure of the arrangement. In this paper, we present a novel approach to autonomously detect and correct misplaced objects by pushing them with a robotic arm. We thoroughly tested our approach both in simulation and on real hardware using a Robotiq two-finger gripper mounted on a UR5 robotic arm. In our evaluation, we demonstrate the successful compensation for different errors injected during the manipulation of regular shaped objects. Consequently, we achieve a highly reliable object placement accuracy in the millimeter range.
Abstract:For the best human-robot interaction experience, the robot's navigation policy should take into account personal preferences of the user. In this paper, we present a learning framework complemented by a perception pipeline to train a depth vision-based, personalized navigation controller from user demonstrations. Our refined virtual reality interface enables the demonstration of robot navigation trajectories under motion of the user for dynamic interaction scenarios. In a detailed analysis, we evaluate different configurations of the perception pipeline. As the experiments demonstrate, our new pipeline compresses the perceived depth images to a latent state representation and, thus, enables efficient reasoning about the robot's dynamic environment to the learning. We discuss the robot's navigation performance in various virtual scenes by enrolling a variational autoencoder in combination with a motion predictor and demonstrate the first personalized robot navigation controller that solely relies on depth images.
Abstract:Reinforcement learning (RL) has recently proven great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control~(MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot navigation both in simulation and real-world experiments with a Kuboki Turtlebot 2. We furthermore demonstrate great improvement over pure RL algorithms in terms of success rate as well as number of collisions and timeouts. Our experiments show that MPC as an experience source improves the agent's learning process for a given task in the case of sparse rewards.