Stanford University
Abstract:Imitation learning (IL) has shown great success in learning complex robot manipulation tasks. However, there remains a need for practical safety methods to justify widespread deployment. In particular, it is important to certify that a system obeys hard constraints on unsafe behavior in settings when it is unacceptable to design a tradeoff between performance and safety via tuning the policy (i.e. soft constraints). This leads to the question, how does enforcing hard constraints impact the performance (meaning safely completing tasks) of an IL policy? To answer this question, this paper builds a reachability-based safety filter to enforce hard constraints on IL, which we call Reachability-Aided Imitation Learning (RAIL). Through evaluations with state-of-the-art IL policies in mobile robots and manipulation tasks, we make two key findings. First, the highest-performing policies are sometimes only so because they frequently violate constraints, and significantly lose performance under hard constraints. Second, surprisingly, hard constraints on the lower-performing policies can occasionally increase their ability to perform tasks safely. Finally, hardware evaluation confirms the method can operate in real time.
Abstract:This study addresses the challenge of social bipedal navigation in a dynamic, human-crowded environment, a research area largely underexplored in legged robot navigation. We present a zonotope-based framework that couples prediction and motion planning for a bipedal ego-agent to account for bidirectional influence with the surrounding pedestrians. This framework incorporates a Social Zonotope Network (SZN), a neural network that predicts future pedestrian reachable sets and plans future socially acceptable reachable set for the ego-agent. SZN generates the reachable sets as zonotopes for efficient reachability-based planning, collision checking, and online uncertainty parameterization. Locomotion-specific losses are added to the SZN training process to adhere to the dynamic limits of the bipedal robot that are not explicitly present in the human crowds data set. These loss functions enable the SZN to generate locomotion paths that are more dynamically feasible for improved tracking. SZN is integrated with a Model Predictive Controller (SZN-MPC) for footstep planning for our bipedal robot Digit. SZN-MPC solves for collision-free trajectory by optimizing through SZN's gradients. and Our results demonstrate the framework's effectiveness in producing a socially acceptable path, with consistent locomotion velocity, and optimality. The SZN-MPC framework is validated with extensive simulations and hardware experiments.
Abstract:The past few years have seen immense progress on two fronts that are critical to safe, widespread mobile robot deployment: predicting uncertain motion of multiple agents, and planning robot motion under uncertainty. However, the numerical methods required on each front have resulted in a mismatch of representation for prediction and planning. In prediction, numerical tractability is usually achieved by coarsely discretizing time, and by representing multimodal multi-agent interactions as distributions with infinite support. On the other hand, safe planning typically requires very fine time discretization, paired with distributions with compact support, to reduce conservativeness and ensure numerical tractability. The result is, when existing predictors are coupled with planning and control, one may often find unsafe motion plans. This paper proposes ZAPP (Zonotope Agreement of Prediction and Planning) to resolve the representation mismatch. ZAPP unites a prediction-friendly coarse time discretization and a planning-friendly zonotope uncertainty representation; the method also enables differentiating through a zonotope collision check, allowing one to integrate prediction and planning within a gradient-based optimization framework. Numerical examples show how ZAPP can produce safer trajectories compared to baselines in interactive scenes.
Abstract:This study addresses the challenge of bipedal navigation in a dynamic human-crowded environment, a research area that remains largely underexplored in the field of legged navigation. We propose two cascaded zonotope-based neural networks: a Pedestrian Prediction Network (PPN) for pedestrians' future trajectory prediction and an Ego-agent Social Network (ESN) for ego-agent social path planning. Representing future paths as zonotopes allows for efficient reachability-based planning and collision checking. The ESN is then integrated with a Model Predictive Controller (ESN-MPC) for footstep planning for our bipedal robot Digit designed by Agility Robotics. ESN-MPC solves for a collision-free optimal trajectory by optimizing through the gradients of ESN. ESN-MPC optimal trajectory is sent to the low-level controller for full-order simulation of Digit. The overall proposed framework is validated with extensive simulations on randomly generated initial settings with varying human crowd densities.
Abstract:Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region mapping via embodied navigation in indoor environments, generating a high-level representation of the knowledge of the agent. To enable region identification, the method uses a vision-to-language model to provide scene information for mapping. By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location. This mapping procedure is paired with a trained navigation policy to enable autonomous map generation. The proposed method significantly outperforms a variety of baselines, including an object-based system and a pretrained scene classifier, in experiments in a photorealistic simulator.
Abstract:Autonomous mobile robots must maintain safety, but should not sacrifice performance, leading to the classical reach-avoid problem. This paper seeks to compute trajectory plans for which a robot is guaranteed to reach a goal and avoid obstacles in the specific near-danger case that the obstacles and goal are near each other. The proposed method builds off of a common approach of using a simplified planning model to generate plans, which are then tracked using a high-fidelity tracking model and controller. Existing safe planning approaches use reachability analysis to overapproximate the error between these models, but this introduces additional numerical approximation error and thereby conservativeness that prevents goal-reaching. The present work instead proposes a Piecewise Affine Reach-avoid Computation (PARC) method to tightly approximate the reachable set of the planning model. With PARC, the main source of conservativeness is the model mismatch, which can be mitigated by careful controller and planning model design. The utility of this method is demonstrated through extensive numerical experiments in which PARC outperforms state-of-the-art reach-avoid methods in near-danger goal-reaching. Furthermore, in a simulated demonstration, PARC enables the generation of provably-safe extreme vehicle dynamics drift parking maneuvers.
Abstract:Connected autonomous vehicles (CAVs) promise to enhance safety, efficiency, and sustainability in urban transportation. However, this is contingent upon a CAV correctly predicting the motion of surrounding agents and planning its own motion safely. Doing so is challenging in complex urban environments due to frequent occlusions and interactions among many agents. One solution is to leverage smart infrastructure to augment a CAV's situational awareness; the present work leverages a recently proposed "Self-Supervised Traffic Advisor" (SSTA) framework of smart sensors that teach themselves to generate and broadcast useful video predictions of road users. In this work, SSTA predictions are modified to predict future occupancy instead of raw video, which reduces the data footprint of broadcast predictions. The resulting predictions are used within a planning framework, demonstrating that this design can effectively aid CAV motion planning. A variety of numerical experiments study the key factors that make SSTA outputs useful for practical CAV planning in crowded urban environments.
Abstract:A key challenge to the widespread deployment of robotic manipulators is the need to ensure safety in arbitrary environments while generating new motion plans in real-time. In particular, one must ensure that a manipulator does not collide with obstacles, collide with itself, or exceed its joint torque limits. This challenge is compounded by the need to account for uncertainty in the mass and inertia of manipulated objects, and potentially the robot itself. The present work addresses this challenge by proposing Autonomous Robust Manipulation via Optimization with Uncertainty-aware Reachability (ARMOUR), a provably-safe, receding-horizon trajectory planner and tracking controller framework for serial link manipulators. ARMOUR works by first constructing a robust, passivity-based controller that is proven to enable a manipulator to track desired trajectories with bounded error despite uncertain dynamics. Next, ARMOUR uses a novel variation on the Recursive Newton-Euler Algorithm (RNEA) to compute the set of all possible inputs required to track any trajectory within a continuum of desired trajectories. Finally, the method computes an over-approximation to the swept volume of the manipulator; this enables one to formulate an optimization problem, which can be solved in real-time, to synthesize provably-safe motion. The proposed method is compared to state of the art methods and demonstrated on a variety of challenging manipulation examples in simulation and on real hardware, such as maneuvering a dumbbell with uncertain mass around obstacles.
Abstract:Unlike many urban localization methods that return point-valued estimates, a set-valued representation enables robustness by ensuring that a continuum of possible positions obeys safety constraints. One strategy with the potential for set-valued estimation is GNSS-based shadow matching~(SM), where one uses a three-dimensional (3-D) map to compute GNSS shadows (where line-of-sight is blocked). However, SM requires a point-valued grid for computational tractability, with accuracy limited by grid resolution. We propose zonotope shadow matching (ZSM) for set-valued 3-D map-aided GNSS localization. ZSM represents buildings and GNSS shadows using constrained zonotopes, a convex polytope representation that enables propagating set-valued estimates using fast vector concatenation operations. Starting from a coarse set-valued position, ZSM refines the estimate depending on the receiver being inside or outside each shadow as judged by received carrier-to-noise density. We demonstrated our algorithm's performance using simulated experiments on a simple 3-D example map and on a dense 3-D map of San Francisco.
Abstract:Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments. However, state-of-the-art deep RL approaches typically lack safety guarantees, especially when the robot and environment models are unknown. To justify widespread deployment, robots must respect safety constraints without sacrificing performance. Thus, we propose a Black-box Reachability-based Safety Layer (BRSL) with three main components: (1) data-driven reachability analysis for a black-box robot model, (2) a trajectory rollout planner that predicts future actions and observations using an ensemble of neural networks trained online, and (3) a differentiable polytope collision check between the reachable set and obstacles that enables correcting unsafe actions. In simulation, BRSL outperforms other state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, and a trajectory-tracking point mass with an unsafe set adjacent to the area of highest reward.