Christoph Killing

Learning to Robustly Negotiate Bi-Directional Lane Usage in High-Conflict Driving Scenarios

Mar 22, 2021

Christoph Killing, Adam Villaflor, John M. Dolan

Figures 1-4 for Learning to Robustly Negotiate Bi-Directional Lane Usage in High-Conflict Driving Scenarios

Abstract: Recently, autonomous driving has made substantial progress in addressing the most common traffic scenarios, such as intersection navigation and lane changing. However, most of these successes have been limited to scenarios that have well-defined traffic rules and require minimal negotiation with other vehicles. In this paper, we introduce a previously unconsidered, yet everyday, high-conflict driving scenario requiring negotiation between agents of equal rights and priorities. There is no centralized control structure, and we do not allow communication. Therefore, it is unknown whether other drivers are willing to cooperate and, if so, to what extent. We train policies to robustly negotiate with opposing vehicles of an unobservable degree of cooperativeness using multi-agent reinforcement learning (MARL). We propose Discrete Asymmetric Soft Actor-Critic (DASAC), a maximum-entropy off-policy MARL algorithm allowing for centralized training with decentralized execution. We show that, using DASAC, we are able to successfully negotiate and traverse the scenario considered over 99% of the time. Our agents are robust to an unknown timing of opponent decisions, an unobservable degree of cooperativeness of the opposing vehicle, and previously unencountered policies. Furthermore, they learn to exhibit human-like behaviors such as defensive driving, anticipating solution options, and interpreting the behavior of other agents.
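The abstract does not spell out how DASAC is computed. As a rough illustration of what "maximum-entropy" means for a discrete action space, the sketch below evaluates the generic soft state value V(s) = Σ_a π(a|s) [Q(s,a) − α log π(a|s)] used in discrete soft actor-critic methods. It is not the authors' implementation: the function names, the example numbers, and the temperature `alpha` are assumptions made for illustration. In an asymmetric setup, one would expect the Q-values to come from a critic that sees privileged state during training, while the policy logits come from an actor that sees only local observations; that split is also an assumption here.

```python
import math


def softmax(logits):
    """Turn raw policy logits into a probability distribution over actions."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_state_value(q_values, logits, alpha):
    """Generic discrete max-entropy state value (hypothetical helper).

    q_values: per-action Q estimates (assumed to come from a centralized critic)
    logits:   per-action policy logits (assumed to come from a decentralized actor)
    alpha:    entropy temperature; alpha = 0 recovers the plain expected Q-value
    """
    probs = softmax(logits)
    # Skip zero-probability actions: their contribution is 0 and log(0) is undefined.
    return sum(
        p * (q - alpha * math.log(p))
        for p, q in zip(probs, q_values)
        if p > 0.0
    )


# Illustrative numbers only: two actions, uniform policy.
value = soft_state_value(q_values=[1.0, 3.0], logits=[0.0, 0.0], alpha=0.5)
```

With a uniform policy over two actions, the entropy bonus adds α·ln 2 to the expected Q-value, so a larger `alpha` rewards policies that keep their options open.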

* 7 pages, 7 figures
