Abstract:Previous methods for Learning from Demonstration leverage several approaches for a human to teach motions to a robot, including teleoperation, kinesthetic teaching, and natural demonstrations. However, little previous work has explored more general interfaces that allow for multiple demonstration types. Given the varied preferences of human demonstrators and task characteristics, a flexible tool that enables multiple demonstration types could be crucial for broader robot skill training. In this work, we propose Versatile Demonstration Interface (VDI), an attachment for collaborative robots that simplifies the collection of three common types of demonstrations. Designed for flexible deployment in industrial settings, our tool requires no additional instrumentation of the environment. Our prototype interface captures human demonstrations through a combination of vision, force sensing, and state tracking (e.g., through the robot proprioception or AprilTag tracking). Through a user study where we deployed our prototype VDI at a local manufacturing innovation center with manufacturing experts, we demonstrated the efficacy of our prototype in representative industrial tasks. Interactions from our study exposed a range of industrial use cases for VDI, clear relationships between demonstration preferences and task criteria, and insights for future tool design.
Abstract:We introduce GeoSACS, a geometric framework for shared autonomy (SA). In variable environments, SA methods can be used to combine robotic capabilities with real-time human input in a way that offloads the physical task from the human. To remain intuitive, it can be helpful to simplify requirements for human input (i.e., reduce the dimensionality), which create challenges for to map low-dimensional human inputs to the higher dimensional control space of robots without requiring large amounts of data. We built GeoSACS on canal surfaces, a geometric framework that represents potential robot trajectories as a canal from as few as two demonstrations. GeoSACS maps user corrections on the cross-sections of this canal to provide an efficient SA framework. We extend canal surfaces to consider orientation and update the control frames to support intuitive mapping from user input to robot motions. Finally, we demonstrate GeoSACS in two preliminary studies, including a complex manipulation task where a robot loads laundry into a washer.
Abstract:Grounding the common-sense reasoning of Large Language Models in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs directly for planning in symbolic spaces, this work uses LLMs to guide the search of task structures and constraints implicit in multi-step demonstrations. Specifically, we borrow from manipulation planning literature the concept of mode families, which group robot configurations by specific motion constraints, to serve as an abstraction layer between the high-level language representations of an LLM and the low-level physical trajectories of a robot. By replaying a few human demonstrations with synthetic perturbations, we generate coverage over the demonstrations' state space with additional successful executions as well as counterfactuals that fail the task. Our explanation-based learning framework trains an end-to-end differentiable neural network to predict successful trajectories from failures and as a by-product learns classifiers that ground low-level states and images in mode families without dense labeling. The learned grounding classifiers can further be used to translate language plans into reactive policies in the physical domain in an interpretable manner. We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. Website: https://sites.google.com/view/grounding-plans
Abstract:Many industrial tasks-such as sanding, installing fasteners, and wire harnessing-are difficult to automate due to task complexity and variability. We instead investigate deploying robots in an assistive role for these tasks, where the robot assumes the physical task burden and the skilled worker provides both the high-level task planning and low-level feedback necessary to effectively complete the task. In this article, we describe the development of a system for flexible human-robot teaming that combines state-of-the-art methods in end-user programming and shared autonomy and its implementation in sanding applications. We demonstrate the use of the system in two types of sanding tasks, situated in aircraft manufacturing, that highlight two potential workflows within the human-robot teaming setup. We conclude by discussing challenges and opportunities in human-robot teaming identified during the development, application, and demonstration of our system.
Abstract:Handheld kinesthetic haptic interfaces can provide greater mobility and richer tactile information as compared to traditional grounded devices. In this paper, we introduce a new handheld haptic interface which takes input using bidirectional coupled finger flexion. We present the device design motivation and design details and experimentally evaluate its performance in terms of transparency and rendering bandwidth using a handheld prototype device. In addition, we assess the device's functional performance through a user study comparing the proposed device to a commonly used grounded input device in a set of targeting and tracking tasks.
Abstract:Shared autonomy methods, where a human operator and a robot arm work together, have enabled robots to complete a range of complex and highly variable tasks. Existing work primarily focuses on one human sharing autonomy with a single robot. By contrast, in this paper we present an approach for multi-robot shared autonomy that enables one operator to provide real-time corrections across two coordinated robots completing the same task in parallel. Sharing autonomy with multiple robots presents fundamental challenges. The human can only correct one robot at a time, and without coordination, the human may be left idle for long periods of time. Accordingly, we develop an approach that aligns the robot's learned motions to best utilize the human's expertise. Our key idea is to leverage Learning from Demonstration (LfD) and time warping to schedule the motions of the robots based on when they may require assistance. Our method uses variability in operator demonstrations to identify the types of corrections an operator might apply during shared autonomy, leverages flexibility in how quickly the task was performed in demonstrations to aid in scheduling, and iteratively estimates the likelihood of when corrections may be needed to ensure that only one robot is completing an action requiring assistance. Through a preliminary simulated study, we show that our method can decrease the overall time spent sanding by iteratively estimating the times when each robot could need assistance and generating an optimized schedule that allows the operator to provide corrections to each robot during these times.
Abstract:Drones can provide a minimally-constrained adapting camera view to support robot telemanipulation. Furthermore, the drone view can be automated to reduce the burden on the operator during teleoperation. However, existing approaches do not focus on two important aspects of using a drone as an automated view provider. The first is how the drone should select from a range of quality viewpoints within the workspace (e.g., opposite sides of an object). The second is how to compensate for unavoidable drone pose uncertainty in determining the viewpoint. In this paper, we provide a nonlinear optimization method that yields effective and adaptive drone viewpoints for telemanipulation with an articulated manipulator. Our first key idea is to use sparse human-in-the-loop input to toggle between multiple automatically-generated drone viewpoints. Our second key idea is to introduce optimization objectives that maintain a view of the manipulator while considering drone uncertainty and the impact on viewpoint occlusion and environment collisions. We provide an instantiation of our drone viewpoint method within a drone-manipulator remote teleoperation system. Finally, we provide an initial validation of our method in tasks where we complete common household and industrial manipulations.
Abstract:Remotely programming robots to execute tasks often relies on registering objects of interest in the robot's environment. Frequently, these tasks involve articulating objects such as opening or closing a valve. However, existing human-in-the-loop methods for registering objects do not consider articulations and the corresponding impact to the geometry of the object, which can cause the methods to fail. In this work, we present an approach where the registration system attempts to automatically determine the object model, pose, and articulation for user-selected points using a nonlinear iterative closest point algorithm. When the automated fitting is incorrect, the operator can iteratively intervene with corrections after which the system will refit the object. We present an implementation of our fitting procedure and evaluate it with a user study that shows that it can improve user performance, in measures of time on task and task load, ease of use, and usefulness compared to a manual registration approach. We also present a situated example that demonstrates the integration of our method in an end-to-end system for articulating a remote valve.
Abstract:Affordance Templates (ATs) are a method for parameterizing objects for autonomous robot manipulations. In this approach, instances of an object are registered by positioning a model in a 3D environment, which requires a large amount of user input. We instead propose a registration method which combines autonomy and user corrections. For selected objects, the system determines both the model and corresponding pose autonomously. The user makes corrections only if the model or pose is incorrect. This method increases the level of autonomy compared to existing approaches which can reduce user input and time on task. In this paper, we present an overview of existing methods, a description of our method, preliminary results, and planned future work.
Abstract:Remote teleoperation of robots can broaden the reach of domain specialists across a wide range of industries such as home maintenance, health care, light manufacturing, and construction. However, current direct control methods are impractical, and existing tools for programming robot remotely have focused on users with significant robotic experience. Extending robot remote programming to end users, i.e., users who are experts in a domain but novices in robotics, requires tools that balance the rich features necessary for complex teleoperation tasks with ease of use. The primary challenge to usability is that novice users are unable to specify complete and robust task plans to allow a robot to perform duties autonomously, particularly in highly variable environments. Our solution is to allow operators to specify shorter sequences of high-level commands, which we call task-level authoring, to create periods of variable robot autonomy. This approach allows inexperienced users to create robot behaviors in uncertain environments by interleaving exploration, specification of behaviors, and execution as separate steps. End users are able to break down the specification of tasks and adapt to the current needs of the interaction and environments, combining the reactivity of direct control to asynchronous operation. In this paper, we describe a prototype system contextualized in light manufacturing and its empirical validation in a user study where 18 participants with some programming experience were able to perform a variety of complex telemanipulation tasks with little training. Our results show that our approach allowed users to create flexible periods of autonomy and solve rich manipulation tasks. Furthermore, participants significantly preferred our system over comparative more direct interfaces, demonstrating the potential of our approach.