Dexter R. R. Scobee

LESS is More: Rethinking Probabilistic Models of Human Behavior

Jan 13, 2020

Andreea Bobu, Dexter R. R. Scobee, Jaime F. Fisac, S. Shankar Sastry, Anca D. Dragan

Abstract: Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7DOF robot arm learning from user demonstrations.

* 9 pages, 7 figures
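
The Boltzmann model described in the abstract above can be illustrated with a minimal sketch over a finite option set. This is not the paper's code: the function name `boltzmann_probs` and the rationality coefficient `beta` are assumptions introduced here for illustration. The example shows the property the abstract criticizes: each option contributes independently, so adding a near-duplicate option shifts probability mass toward that kind of behavior.

```python
import math

def boltzmann_probs(rewards, beta=1.0):
    """Return P(i) proportional to exp(beta * reward_i) over a finite option set.

    `beta` is a hypothetical rationality coefficient (not named in the abstract).
    """
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp(beta * (r - m)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

# Two distinct options with equal reward: each gets probability 1/2.
two = boltzmann_probs([1.0, 1.0])

# Add a near-duplicate of the first option with the same reward.
# Options 0 and 1 represent the same kind of behavior, yet together
# they now receive 2/3 of the probability mass.
three = boltzmann_probs([1.0, 1.0, 1.0])
first_kind_mass = three[0] + three[1]
```

A model that accounts for similarity between trajectories, as the abstract proposes, would aim to keep the combined mass of near-duplicates from growing merely because there are more of them.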

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

Sep 12, 2019

Dexter R. R. Scobee, S. Shankar Sastry

Abstract: While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent's policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly described by a simple reward combined with a set of hard constraints. In this setting, the agent is attempting to maximize cumulative rewards subject to these given constraints on their behavior. We reformulate the problem of IRL on Markov Decision Processes (MDPs) such that, given a nominal model of the environment and a nominal reward function, we seek to estimate state, action, and feature constraints in the environment that motivate an agent's behavior. Our approach is based on the Maximum Entropy IRL framework, which allows us to reason about the likelihood of an expert agent's demonstrations given our knowledge of an MDP. Using our method, we can infer which constraints can be added to the MDP to most increase the likelihood of observing these demonstrations. We present an algorithm which iteratively infers the Maximum Likelihood Constraint to best explain observed behavior, and we evaluate its efficacy using both simulated behavior and recorded data of humans navigating around an obstacle.
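
The idea of picking the constraint that most increases demonstration likelihood can be sketched on a toy problem. This is a simplified illustration, not the paper's algorithm: it assumes a small enumerated set of trajectories with probability proportional to exponentiated reward, and all names and data (`trajectories`, `demonstrations`, `candidate_constraints`) are hypothetical. Forbidding a state removes every trajectory that visits it and renormalizes the rest.

```python
import math

# Hypothetical toy data: trajectory name -> (set of visited states, reward).
trajectories = {
    "a": ({"s1", "s2"}, 1.0),
    "b": ({"s1", "s3"}, 2.0),
    "c": ({"s4"}, 0.5),
    "d": ({"s3", "s4"}, 1.5),
}
demonstrations = ["a", "a", "c"]                   # observed expert behavior
candidate_constraints = ["s1", "s2", "s3", "s4"]   # candidate forbidden states

def log_likelihood(forbidden):
    """Log-likelihood of the demonstrations when `forbidden` states are disallowed."""
    allowed = {name: r for name, (states, r) in trajectories.items()
               if not (states & forbidden)}
    # A constraint that rules out an observed demonstration has zero likelihood.
    if any(d not in allowed for d in demonstrations):
        return -math.inf
    log_z = math.log(sum(math.exp(r) for r in allowed.values()))
    return sum(allowed[d] - log_z for d in demonstrations)

def best_constraint(current):
    """One greedy step: the candidate whose addition maximizes the likelihood."""
    return max(candidate_constraints,
               key=lambda s: log_likelihood(current | {s}))

chosen = best_constraint(set())
```

In this toy data, only forbidding `s3` leaves every demonstration feasible, and it removes the high-reward trajectories the expert never took, so it is the step that raises the likelihood. Repeating the step with the chosen constraint added to `current` mirrors the iterative procedure the abstract describes.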

Modeling Supervisor Safe Sets for Improving Collaboration in Human-Robot Teams

May 09, 2018

David L. McPherson, Dexter R. R. Scobee, Joseph Menke, Allen Y. Yang, S. Shankar Sastry

Abstract: When a human supervisor collaborates with a team of robots, their attention is divided and cognitive resources are at a premium. We aim to optimize the distribution of these resources and the flow of attention. To this end, we propose the model of an idealized supervisor to describe human behavior. Such a supervisor employs a potentially inaccurate internal model of the robots' dynamics to judge safety. We represent these safety judgements by constructing a safe set from this internal model using reachability theory. When a robot leaves this safe set, the idealized supervisor will intervene to assist, regardless of whether or not the robot remains objectively safe. False positives, where a human supervisor incorrectly judges a robot to be in danger, needlessly consume supervisor attention. In this work, we propose a method that decreases false positives by learning the supervisor's safe set and using that information to govern robot behavior. We prove that robots behaving according to our approach will reduce the occurrence of false positives for our idealized supervisor model. Furthermore, we empirically validate our approach with a user study that demonstrates a significant ($p = 0.0328$) reduction in false positives for our method compared to a baseline safety controller.
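
The notion of a false positive relative to a supervisor's safe set can be illustrated with a toy one-dimensional braking example. This is a sketch under strong assumptions, not the paper's reachability computation: the robot drives toward an obstacle, safety means it can stop in time, and the supervisor's internal model underestimates the braking capability. The constants `TRUE_DECEL` and `SUPERVISOR_DECEL` and all function names are hypothetical.

```python
def is_safe(distance, speed, max_decel):
    """True if a robot braking at `max_decel` can stop within `distance`."""
    return speed ** 2 / (2.0 * max_decel) <= distance

TRUE_DECEL = 4.0        # hypothetical true braking capability (m/s^2)
SUPERVISOR_DECEL = 2.0  # hypothetical, more conservative internal model (m/s^2)

def false_positive(distance, speed):
    """Supervisor would intervene although the robot is objectively safe."""
    objectively_safe = is_safe(distance, speed, TRUE_DECEL)
    supervisor_safe = is_safe(distance, speed, SUPERVISOR_DECEL)
    return objectively_safe and not supervisor_safe

def max_speed_in_supervisor_set(distance):
    """Largest speed that keeps the state inside the supervisor's safe set."""
    return (2.0 * SUPERVISOR_DECEL * distance) ** 0.5

# At 4 m from the obstacle and 5 m/s the robot can stop (3.125 m needed),
# but the supervisor's model predicts 6.25 m, so the supervisor intervenes.
unconstrained = false_positive(4.0, 5.0)

# Capping speed to the supervisor's safe set avoids that intervention.
capped_speed = max_speed_in_supervisor_set(4.0)
constrained = false_positive(4.0, capped_speed)
```

The cap trades some performance for fewer needless interventions, which is the kind of trade-off governed by a learned supervisor safe set in the abstract.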
