Abstract:Robot Imitation Learning (IL) is a crucial technique in robot learning, where agents learn by mimicking human demonstrations. However, IL encounters scalability challenges stemming from both non-user-friendly demonstration collection methods and the extensive time required to amass a sufficient number of demonstrations for effective training. In response, we introduce the Augmented Reality for Collection and generAtion of DEmonstrations (ARCADE) framework, designed to scale up demonstration collection for robot manipulation tasks. Our framework combines two key capabilities: 1) it leverages AR to make demonstration collection as simple as users performing daily tasks using their hands, and 2) it enables the automatic generation of additional synthetic demonstrations from a single human-derived demonstration, significantly reducing user effort and time. We assess ARCADE's performance on a real Fetch robot across three robotics tasks: 3-Waypoints-Reach, Push, and Pick-And-Place. Using our framework, we rapidly trained a policy with vanilla Behavioral Cloning (BC), a classic IL algorithm, that excelled across all three tasks. We also deploy ARCADE on a real household task, Pouring-Water, achieving an 80% success rate.
Abstract:Robot Imitation Learning (IL) is a widely used method for training robots to perform manipulation tasks that involves mimicking human demonstrations to acquire skills. However, its practicality has been limited by the requirement that users be trained to operate real robot arms in order to provide demonstrations. This paper presents an innovative solution: an Augmented Reality (AR)-assisted framework for demonstration collection, empowering non-roboticist users to produce demonstrations for robot IL using devices like the HoloLens 2. Our framework facilitates scalable and diverse demonstration collection for real-world tasks. We validate our approach with experiments on three classical robotics tasks: reach, push, and pick-and-place. The real robot performs each task successfully while replaying demonstrations collected via AR.
Abstract:Human-robot interaction is now an established discipline. Dozens of HRI courses exist at universities worldwide, and some institutions even offer degrees in HRI. However, although many students are being taught HRI, there is no agreed-upon curriculum for an introductory HRI course. In this workshop, we aimed to reach community consensus on what should be covered in such a course. Through interactive activities like panels, breakout discussions, and syllabus design, workshop participants explored the many topics and pedagogical approaches for teaching HRI. This collection of articles submitted to the workshop provides examples of HRI courses being offered worldwide.
Abstract:This paper describes TOBY, a visualization tool that helps a user explore the contents of an academic survey paper. The visualization consists of four components: a hierarchical view of taxonomic data in the survey, a document similarity view in the space of taxonomic classes, a network view of citations, and a new paper recommendation tool. In this paper, we will discuss these features in the context of three separate deployments of the tool.
Abstract:As autonomous robots are deployed in increasingly complex environments, platform degradation, environmental uncertainties, and deviations from validated operation conditions can make it difficult for human partners to understand robot capabilities and limitations. The ability for a robot to self-assess its competency in dynamic and uncertain environments will be a crucial next step in successful human-robot teaming. This work presents and evaluates an Event-Triggered Generalized Outcome Assessment (ET-GOA) algorithm for autonomous agents to dynamically assess task confidence during execution. The algorithm uses a fast online statistical test of the agent's observations and its model predictions to decide when competency assessment is needed. We provide experimental results using ET-GOA to generate competency reports during a simulated delivery task and suggest future research directions for self-assessing agents.
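The abstract above describes ET-GOA only at a high level: a fast online statistical test compares the agent's observations with its model predictions and triggers a fuller competency assessment when they disagree. The exact test and assessment procedure are not given, so the following is a minimal Python sketch under assumed simplifications: predictions are Gaussian forecasts of a scalar state feature, the trigger is a standardized-surprise threshold, and `full_outcome_assessment` is a stand-in for the expensive assessment step.

```python
# Minimal sketch of an event-triggered competency check (assumptions noted above;
# this is not the paper's exact formulation).
import numpy as np

def surprise(observation: float, pred_mean: float, pred_std: float) -> float:
    """Standardized deviation of the observation from the model's prediction."""
    return abs(observation - pred_mean) / max(pred_std, 1e-6)

def full_outcome_assessment(history: list) -> float:
    """Placeholder for the (expensive) outcome-assessment step.
    Here it simply reports the fraction of recent steps with low surprise."""
    return float(np.mean([s < 2.0 for s in history[-20:]]))

def run_episode(observations, predictions, trigger_threshold=3.0):
    """predictions: iterable of (mean, std) forecasts aligned with observations."""
    surprises, reports = [], []
    for t, (obs, (mu, sigma)) in enumerate(zip(observations, predictions)):
        s = surprise(obs, mu, sigma)
        surprises.append(s)
        if s > trigger_threshold:              # event trigger: model and reality disagree
            confidence = full_outcome_assessment(surprises)
            reports.append((t, confidence))    # competency report issued at step t
    return reports

# Toy usage: a drifting observation stream against a static forecast.
obs = np.linspace(0.0, 5.0, 50)
preds = [(0.0, 0.5)] * 50
print(run_episode(obs, preds))
```

The design point illustrated is that the cheap surprise test runs every step, while the heavier assessment runs only when triggered.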
Abstract:Large Language Models (LLMs) trained on massive text datasets have recently shown promise in generating action plans for robotic agents from high-level text queries. However, these models typically do not consider the robot's environment, resulting in generated plans that may not actually be executable due to ambiguities in the planned actions or environmental constraints. In this paper, we propose an approach to generate environmentally-aware action plans that can be directly mapped to executable agent actions. Our approach integrates environmental objects and object relations as additional inputs to LLM action plan generation, providing the system with an awareness of its surroundings and producing plans in which each generated action is mapped to objects present in the scene. We also design a novel scoring function that, along with generating the action steps and associating them with objects, helps the system disambiguate among object instances and take their states into account. We evaluate our approach using the VirtualHome simulator and the ActivityPrograms knowledge base. Our results show that the action plans generated by our system outperform prior work in correctness and executability by 5.3% and 8.9%, respectively.
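The abstract does not specify the form of the scoring function, so the snippet below is an illustrative Python sketch only: it grounds a generated action argument to a scene object by combining a toy name-similarity term with a state-consistency term (e.g., discouraging "open" on an already-open object). The `scene` structure and `score_grounding` helper are hypothetical.

```python
# Toy grounding/scoring sketch; not the paper's actual scoring function.
from difflib import SequenceMatcher

scene = [
    {"name": "fridge", "id": 1, "states": {"CLOSED"}},
    {"name": "fridge", "id": 2, "states": {"OPEN"}},
    {"name": "kitchen table", "id": 3, "states": set()},
]

def score_grounding(action: str, argument: str, obj: dict) -> float:
    """Higher is better: name similarity plus a state-consistency penalty."""
    name_sim = SequenceMatcher(None, argument.lower(), obj["name"].lower()).ratio()
    state_penalty = -0.5 if (action == "open" and "OPEN" in obj["states"]) else 0.0
    return name_sim + state_penalty

def ground(action: str, argument: str, objects: list) -> dict:
    """Pick the scene object instance that best matches the generated step."""
    return max(objects, key=lambda obj: score_grounding(action, argument, obj))

print(ground("open", "fridge", scene))   # prefers the CLOSED fridge (id 1)
```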
Abstract:Human-robot teams will soon be expected to accomplish complex tasks in high-risk and uncertain environments. Here, the human may not necessarily be a robotics expert, but will need to establish a baseline understanding of the robot's abilities in order to appropriately utilize and rely on the robot. This willingness to rely, also known as trust, is based partly on the human's belief in the robot's proficiency at a given task. If trust is too high, the human may push the robot beyond its capabilities. If trust is too low, the human may not utilize the robot when they otherwise could have, wasting precious resources. In this work, we develop and execute an online human-subjects study to investigate how robot proficiency self-assessment reports based on Factorized Machine Self-Confidence affect operator trust and task performance in a grid world navigation task. Additionally, we present and analyze a metric for trust level assessment, which measures the allocation of control between the operator and the robot when the human teammate is free to switch between teleoperation and autonomous control. Our results show that an a priori robot self-assessment report aligns operator trust with robot proficiency, and leads to performance improvements and small increases in self-reported trust.
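The control-allocation metric mentioned above is not defined in the abstract; a minimal sketch, assuming it reduces to the fraction of timesteps the operator delegates to autonomous control rather than teleoperating, could look like the following (the mode labels and function name are hypothetical).

```python
# Minimal sketch of a control-allocation metric as a behavioral proxy for trust.
from typing import Sequence

def control_allocation(modes: Sequence[str]) -> float:
    """modes: per-timestep control mode, either 'auto' or 'teleop'."""
    if not modes:
        return 0.0
    return sum(m == "auto" for m in modes) / len(modes)

# Example: the operator hands control to the robot for 6 of 10 timesteps.
log = ["teleop"] * 4 + ["auto"] * 6
print(control_allocation(log))   # 0.6
```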
Abstract:Virtual, Augmented, and Mixed Reality for Human-Robot Interaction (VAM-HRI) has been gaining considerable attention in research in recent years. However, the HRI community lacks a shared set of terminology and a framework for characterizing aspects of mixed-reality interfaces, which presents serious problems for future research. It is therefore important to have a common set of terms and concepts that can be used to precisely describe and organize the diverse array of work being done within the field. In this paper, we present a novel taxonomic framework for different types of VAM-HRI interfaces, composed of four main categories of virtual design elements (VDEs). We present and justify our taxonomy, explain how its elements have developed over the last 30 years, and discuss the directions in which VAM-HRI is headed in the coming decade.
Abstract:In shared control, advances in autonomous robotics are applied to help empower a human user in operating a robotic system. While these systems have been shown to improve efficiency and operation success, users are not always accepting of the new control paradigm produced by working with an assistive controller. This mismatch between performance and acceptance can prevent users from taking advantage of the benefits of shared control systems for robotic operation. To address this mismatch, we develop multiple types of visualizations for improving both the legibility and perceived predictability of assistive controllers, then conduct a user study to evaluate the impact that these visualizations have on user acceptance of shared control systems. Our results demonstrate that shared control visualizations must be designed carefully to be effective, with users requiring visualizations that improve both legibility and predictability of the assistive controller in order to voluntarily relinquish control.
Abstract:RoomShift is a room-scale dynamic haptic environment for virtual reality that uses a small swarm of robots to move furniture. RoomShift consists of nine shape-changing robots: Roombas with mechanical scissor lifts. These robots drive beneath a piece of furniture to lift, move, and place it. By augmenting virtual scenes with physical objects, users can sit on, lean against, place, and otherwise interact with furniture with their whole body, just as in the real world. When the virtual scene changes or users navigate within it, the swarm of robots dynamically reconfigures the physical environment to match the virtual content. We describe the hardware and software implementation, applications in virtual tours and architectural design, and interaction techniques.