Arie Glazier

Learning Behavioral Soft Constraints from Demonstrations

Feb 21, 2022

Arie Glazier, Andrea Loreggia, Nicholas Mattei, Taher Rahgooy, Francesca Rossi, Brent Venable

Abstract: Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to weigh collective rules and norms against our own personal objectives and desires. To create effective AI-human teams, we must equip AI agents with a model of how humans make these trade-offs in complex environments when there are implicit and explicit rules and constraints. Agents equipped with these models will be able to mirror human behavior and/or draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method: Max Entropy Inverse Soft Constraint IRL (MESC-IRL), for learning implicit hard and soft constraints over states, actions, and state features from demonstrations in deterministic and non-deterministic environments modeled as Markov Decision Processes (MDPs). Our method enables agents to implicitly learn human constraints and desires without the need for explicit modeling by the agent designer, and to transfer these constraints between environments. Our method generalizes prior work, which considered only deterministic hard constraints, and achieves state-of-the-art performance.

* arXiv admin note: substantial text overlap with arXiv:2109.11018

Making Human-Like Trade-offs in Constrained Environments by Learning from Demonstrations

Sep 22, 2021

Arie Glazier, Andrea Loreggia, Nicholas Mattei, Taher Rahgooy, Francesca Rossi, K. Brent Venable

Abstract: Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to evaluate the trade-off between collective norms and our own personal objectives. To create effective AI-human teams, we must equip AI agents with a model of how humans make trade-offs in complex, constrained environments. These agents will be able to mirror human behavior or draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method for learning implicit hard and soft constraints from demonstrations, enabling agents to quickly adapt to new settings. In addition, learning soft constraints over states, actions, and state features allows agents to transfer this knowledge to new domains that share similar aspects. We then use the constraint learning method to implement a novel system architecture that leverages a cognitive model of human decision making, multi-alternative decision field theory (MDFT), to orchestrate competing objectives. We evaluate the resulting agent on trajectory length, number of violated constraints, and total reward, demonstrating that our agent architecture is general and achieves strong performance. Thus we are able to capture and replicate human-like trade-offs from demonstrations in environments where constraints are not explicit.
