Abstract:Teaching robots dexterous manipulation skills, such as tool use, presents a significant challenge. Current approaches can be broadly categorized into two strategies: human teleoperation (for imitation learning) and sim-to-real reinforcement learning. The first approach is difficult as it is hard for humans to produce safe and dexterous motions on a different embodiment without touch feedback. The second RL-based approach struggles with the domain gap and involves highly task-specific reward engineering on complex tasks. Our key insight is that RL is effective at learning low-level motion primitives, while humans excel at providing coarse motion commands for complex, long-horizon tasks. Therefore, the optimal solution might be a combination of both approaches. In this paper, we introduce DexterityGen (DexGen), which uses RL to pretrain large-scale dexterous motion primitives, such as in-hand rotation or translation. We then leverage this learned dataset to train a dexterous foundational controller. In the real world, we use human teleoperation as a prompt to the controller to produce highly dexterous behavior. We evaluate the effectiveness of DexGen in both simulation and real world, demonstrating that it is a general-purpose controller that can realize input dexterous manipulation commands and significantly improves stability by 10-100x measured as duration of holding objects across diverse tasks. Notably, with DexGen we demonstrate unprecedented dexterous skills including diverse object reorientation and dexterous tool use such as pen, syringe, and screwdriver for the first time.
Abstract:Considerable efforts have been made to improve monocular depth estimation under ideal conditions. However, in challenging environments, monocular depth estimation still faces difficulties. In this paper, we introduce visual prompt learning for predicting depth across different environments within a unified model, and present a self-supervised learning framework called PromptMono. It employs a set of learnable parameters as visual prompts to capture domain-specific knowledge. To integrate prompting information into image representations, a novel gated cross prompting attention (GCPA) module is proposed, which enhances the depth estimation in diverse conditions. We evaluate the proposed PromptMono on the Oxford Robotcar dataset and the nuScenes dataset. Experimental results demonstrate the superior performance of the proposed method.
Abstract:Interacting with human agents in complex scenarios presents a significant challenge for robotic navigation, particularly in environments that necessitate both collision avoidance and collaborative interaction, such as indoor spaces. Unlike static or predictably moving obstacles, human behavior is inherently complex and unpredictable, stemming from dynamic interactions with other agents. Existing simulation tools frequently fail to adequately model such reactive and collaborative behaviors, impeding the development and evaluation of robust social navigation strategies. This paper introduces a novel framework utilizing distributed potential games to simulate human-like interactions in highly interactive scenarios. Within this framework, each agent imagines a virtual cooperative game with others based on its estimation. We demonstrate this formulation can facilitate the generation of diverse and realistic interaction patterns in a configurable manner across various scenarios. Additionally, we have developed a gym-like environment leveraging our interactive agent model to facilitate the learning and evaluation of interactive navigation algorithms.
Abstract:Learning contact-rich manipulation skills is essential. Such skills require the robots to interact with the environment with feasible manipulation trajectories and suitable compliance control parameters to enable safe and stable contact. However, learning these skills is challenging due to data inefficiency in the real world and the sim-to-real gap in simulation. In this paper, we introduce a hybrid offline-online framework to learn robust manipulation skills. We employ model-free reinforcement learning for the offline phase to obtain the robot motion and compliance control parameters in simulation \RV{with domain randomization}. Subsequently, in the online phase, we learn the residual of the compliance control parameters to maximize robot performance-related criteria with force sensor measurements in real time. To demonstrate the effectiveness and robustness of our approach, we provide comparative results against existing methods for assembly, pivoting, and screwing tasks.
Abstract:Constrained environments are common in practical applications of manipulating deformable linear objects (DLOs), where movements of both DLOs and robots should be constrained. This task is high-dimensional and highly constrained owing to the highly deformable DLOs, dual-arm robots with high degrees of freedom, and 3-D complex environments, which render global planning challenging. Furthermore, accurate DLO models needed by planning are often unavailable owing to their strong nonlinearity and diversity, resulting in unreliable planned paths. This article focuses on the global moving and shaping of DLOs in constrained environments by dual-arm robots. The main objectives are 1) to efficiently and accurately accomplish this task, and 2) to achieve generalizable and robust manipulation of various DLOs. To this end, we propose a complementary framework with whole-body planning and control using appropriate DLO model representations. First, a global planner is proposed to efficiently find feasible solutions based on a simplified DLO energy model, which considers the full system states and all constraints to plan more reliable paths. Then, a closed-loop manipulation scheme is proposed to compensate for the modeling errors and enhance the robustness and accuracy, which incorporates a model predictive controller that real-time adjusts the robot motion based on an adaptive DLO motion model. The key novelty is that our framework can efficiently solve the high-dimensional problem subject to multiple constraints and generalize to various DLOs without elaborate model identifications. Experiments demonstrate that our framework can accomplish considerably more complicated tasks than existing works, with significantly higher efficiency, generalizability, and reliability.
Abstract:Interactive behavior modeling of multiple agents is an essential challenge in simulation, especially in scenarios when agents need to avoid collisions and cooperate at the same time. Humans can interact with others without explicit communication and navigate in scenarios when cooperation is required. In this work, we aim to model human interactions in this realistic setting, where each agent acts based on its observation and does not communicate with others. We propose a framework based on distributed potential games, where each agent imagines a cooperative game with other agents and solves the game using its estimation of their behavior. We utilize iLQR to solve the games and closed-loop simulate the interactions. We demonstrate the benefits of utilizing distributed imagined games in our framework through various simulation experiments. We show the high success rate, the increased navigation efficiency, and the ability to generate rich and realistic interactions with interpretable parameters. Illustrative examples are available at https://sites.google.com/berkeley.edu/distributed-interaction.
Abstract:This paper presents a differential geometric control approach that leverages SE(3) group invariance and equivariance to increase transferability in learning robot manipulation tasks that involve interaction with the environment. Specifically, we employ a control law and a learning representation framework that remain invariant under arbitrary SE(3) transformations of the manipulation task definition. Furthermore, the control law and learning representation framework are shown to be SE(3) equivariant when represented relative to the spatial frame. The proposed approach is based on utilizing a recently presented geometric impedance control (GIC) combined with a learning variable impedance control framework, where the gain scheduling policy is trained in a supervised learning fashion from expert demonstrations. A geometrically consistent error vector (GCEV) is fed to a neural network to achieve a gain scheduling policy that remains invariant to arbitrary translation and rotations. A comparison of our proposed control and learning framework with a well-known Cartesian space learning impedance control, equipped with a Cartesian error vector-based gain scheduling policy, confirms the significantly superior learning transferability of our proposed approach. A hardware implementation on a peg-in-hole task is conducted to validate the learning transferability and feasibility of the proposed approach.
Abstract:This paper presents a sampling-based motion planning framework that leverages the geometry of obstacles in a workspace as well as prior experiences from motion planning problems. Previous studies have demonstrated the benefits of utilizing prior solutions to motion planning problems for improving planning efficiency. However, particularly for high-dimensional systems, achieving high performance across randomized environments remains a technical challenge for experience-based approaches due to the substantial variance between each query. To address this challenge, we propose a novel approach that involves decoupling the problem into subproblems through algorithmic workspace decomposition and graph search. Additionally, we capitalize on prior experience within each subproblem. This approach effectively reduces the variance across different problems, leading to improved performance for experience-based planners. To validate the effectiveness of our framework, we conduct experiments using 2D and 6D robotic systems. The experimental results demonstrate that our framework outperforms existing algorithms in terms of planning time and cost.
Abstract:Continuous formulations of trajectory planning problems have two main benefits. First, constraints are guaranteed to be satisfied at all times. Secondly, dynamic obstacles can be naturally considered with time. This paper introduces a novel B-spline based trajectory optimization method for multi-jointed robots that provides a continuous trajectory with guaranteed continuous constraints satisfaction. At the core of this method, B-spline basic operations, like addition, multiplication, and derivative, are rigorously defined and applied for problem formulation. B-spline unique characteristics, such as the convex hull and smooth curves properties, are utilized to reformulate the original continuous optimization problem into a finite-dimensional problem. Collision avoidance with static obstacles is achieved using the signed distance field, while that with dynamic obstacles is accomplished via constructing time-varying separating hyperplanes. Simulation results on various robots validate the effectiveness of the algorithm. In addition, this paper provides experimental validations with a 6-link FANUC robot avoiding static and moving obstacles.
Abstract:Learning generalizable insertion skills in a data-efficient manner has long been a challenge in the robot learning community. While the current state-of-the-art methods with reinforcement learning (RL) show promising performance in acquiring manipulation skills, the algorithms are data-hungry and hard to generalize. To overcome the issues, in this paper we present Prim-LAfD, a simple yet effective framework to learn and adapt primitive-based insertion skills from demonstrations. Prim-LAfD utilizes black-box function optimization to learn and adapt the primitive parameters leveraging prior experiences. Human demonstrations are modeled as dense rewards guiding parameter learning. We validate the effectiveness of the proposed method on eight peg-hole and connector-socket insertion tasks. The experimental results show that our proposed framework takes less than one hour to acquire the insertion skills and as few as fifteen minutes to adapt to an unseen insertion task on a physical robot.