Abstract:This paper introduces a new method for safety-aware robot learning, focusing on repairing policies using predictive models. Our method combines behavioral cloning with neural network repair in a two-step supervised learning framework. It first learns a policy from expert demonstrations and then applies repair subject to predictive models to enforce safety constraints. The predictive models can encompass various aspects relevant to robot learning applications, such as proprioceptive states and collision likelihood. Our experimental results demonstrate that the learned policy successfully adheres to a predefined set of safety constraints on two applications: mobile robot navigation and real-world lower-leg prostheses. Additionally, we show that our method reduces repeated interaction with the robot, leading to substantial time savings during the learning process.
Abstract:Guaranteeing safety in human-centric applications is critical in robot learning, as the learned policies may demonstrate unsafe behaviors in previously unseen scenarios. We present a framework to locally repair an erroneous policy network to satisfy a set of formal safety constraints using Mixed Integer Quadratic Programming (MIQP). Our MIQP formulation explicitly imposes the safety constraints on the learned policy while minimizing the original loss function. The policy network is then verified to be locally safe. We demonstrate the application of our framework to derive safe policies for a robotic lower-leg prosthesis.
Abstract:Assistive robotic devices are a particularly promising field of application for neural networks (NN) due to the need for personalization and hard-to-model human-machine interaction dynamics. However, NN-based estimators and controllers may produce potentially unsafe outputs over previously unseen data points. In this paper, we introduce an algorithm for updating NN control policies to satisfy a given set of formal safety constraints, while also optimizing the original loss function. Given a set of mixed-integer linear constraints, we define the NN repair problem as a Mixed Integer Quadratic Program (MIQP). In extensive experiments, we demonstrate the efficacy of our repair method in generating safe policies for a lower-leg prosthesis.
Abstract:In this paper, we propose a framework to repair a pre-trained feed-forward neural network (NN) to satisfy a set of properties. We formulate the properties as a set of predicates that impose constraints on the output of the NN over the target input domain. We define the NN repair problem as a Mixed Integer Quadratic Program (MIQP) to adjust the weights of a single layer subject to the given predicates while minimizing the original loss function over the original training domain. We demonstrate the application of our framework in bounding an affine transformation, correcting an erroneous NN in classification, and bounding the inputs of an NN controller.
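The abstract above does not say where the integer variables in the MIQP come from. One common source, assumed here for illustration and not confirmed by the abstract, is the big-M encoding of ReLU activations downstream of the repaired layer: a binary variable b selects the active or inactive branch of y = max(0, x) using only linear constraints. The sketch below checks that encoding by brute force; the names relu_bigm_feasible and feasible_outputs and the bounds lo, hi are hypothetical.

```python
def relu_bigm_feasible(x, y, b, lo, hi, tol=1e-9):
    """Check the big-M constraints linking y = max(0, x) to a binary b.

    Assumes known pre-activation bounds lo <= x <= hi with lo < 0 < hi.
    """
    return (
        b in (0, 1)
        and y >= -tol                      # y >= 0
        and y >= x - tol                   # y >= x
        and y <= x - lo * (1 - b) + tol    # tight (y <= x) when b = 1
        and y <= hi * b + tol              # forces y <= 0 when b = 0
    )


def feasible_outputs(x, lo, hi, candidates):
    """Return the candidate y values that satisfy the constraints for some b."""
    return sorted(
        {y for y in candidates for b in (0, 1)
         if relu_bigm_feasible(x, y, b, lo, hi)}
    )
```

For any x within the bounds, the only feasible y is max(0, x), so a mixed-integer solver can reason about the network exactly while the repaired weights enter the constraints as continuous variables.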
Abstract:The increasing deployment of robots in co-working scenarios with humans has revealed complex safety and efficiency challenges in the computation of robot behavior. Movement among humans is one of the most fundamental -- and yet critical -- problems in this frontier. While several approaches have addressed this problem from a purely navigational point of view, the absence of a unified paradigm for communicating with humans limits their ability to prevent deadlocks and compute feasible solutions. This paper presents a joint communication and motion planning framework that selects from an arbitrary input set of robot communication signals while computing robot motion plans. It models a human co-worker's imperfect perception of these communications using a noisy sensor model and facilitates the specification of a variety of social/workplace compliance priorities with a flexible cost function. Theoretical results and simulator-based empirical evaluations show that our approach efficiently computes motion plans and communication strategies that reduce conflicts between agents and resolve potential deadlocks.
Abstract:Sampling-based methods such as Rapidly-exploring Random Trees (RRTs) have been widely used for generating motion paths for autonomous mobile systems. In this work, we extend time-based RRTs with Control Barrier Functions (CBFs) to generate safe motion plans in dynamic environments with many pedestrians. Our framework is based upon a human motion prediction model which is well suited for narrow indoor environments. We demonstrate our approach on a high-fidelity model of the Toyota Human Support Robot navigating in narrow corridors. We show in three scenarios that our proposed online method can navigate safely in the presence of moving agents with unknown dynamics.
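As a rough illustration of how a CBF can gate the edges of a time-based RRT, the sketch below rolls a candidate edge forward in time and rejects it if the barrier condition dh/dt + alpha * h >= 0 fails at any step. Everything specific here is an assumption made for the example and is not taken from the paper: a single-integrator robot with velocity command u, one circular pedestrian region, a constant-velocity pedestrian prediction, the barrier h = ||p - p_obs||^2 - r^2, and the function names.

```python
def barrier(p, obstacle, radius):
    """h(p) = squared distance to the obstacle centre minus radius squared."""
    dx, dy = p[0] - obstacle[0], p[1] - obstacle[1]
    return dx * dx + dy * dy - radius * radius


def cbf_condition(p, u, obstacle, obstacle_vel, radius, alpha=1.0):
    """True if dh/dt + alpha * h >= 0 for a single-integrator robot p' = u."""
    dx, dy = p[0] - obstacle[0], p[1] - obstacle[1]
    # Time derivative of h along the relative motion of robot and obstacle.
    h_dot = 2.0 * (dx * (u[0] - obstacle_vel[0]) + dy * (u[1] - obstacle_vel[1]))
    return h_dot + alpha * barrier(p, obstacle, radius) >= 0.0


def edge_is_safe(p, u, obstacle, obstacle_vel, radius, duration,
                 dt=0.05, alpha=1.0):
    """Check the CBF condition along an edge driven by constant velocity u.

    The obstacle is propagated with a constant-velocity prediction (assumed).
    """
    px, py = p
    ox, oy = obstacle
    for _ in range(int(round(duration / dt)) + 1):
        if not cbf_condition((px, py), u, (ox, oy), obstacle_vel, radius, alpha):
            return False
        px, py = px + u[0] * dt, py + u[1] * dt
        ox, oy = ox + obstacle_vel[0] * dt, oy + obstacle_vel[1] * dt
    return True
```

In a planner, a check like edge_is_safe would replace or complement the usual static collision test when a new node is connected to the tree, so that edges heading toward a predicted pedestrian position too quickly are pruned before they are added.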