Title: Beyond Value: CHECKLIST for Testing Inferences in Planning-Based RL

Jun 07, 2022

Kin-Ho Lam, Delyar Tabatabai, Jed Irvine, Donald Bertucci, Anita Ruangrotsakun, Minsuk Kahng, Alan Fern

Abstract: Reinforcement learning (RL) agents are commonly evaluated via their expected value over a distribution of test scenarios. Unfortunately, this evaluation approach provides limited evidence for post-deployment generalization beyond the test distribution. In this paper, we address this limitation by extending the recent CheckList testing methodology from natural language processing to planning-based RL. Specifically, we consider testing RL agents that make decisions via online tree search using a learned transition model and value function. The key idea is to improve the assessment of future performance via a CheckList approach for exploring and assessing the agent's inferences during tree search. The approach provides the user with an interface and general query-rule mechanism for identifying potential inference flaws and validating expected inference invariances. We present a user study involving knowledgeable AI researchers using the approach to evaluate an agent trained to play a complex real-time strategy game. The results show the approach is effective in allowing users to identify previously unknown flaws in the agent's reasoning. In addition, our analysis provides insight into how AI experts use this type of testing approach, which may help improve future instantiations.

* This work will appear in the Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS 2022): https://icaps22.icaps-conference.org/papers
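
The query-rule idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or interface: the `Node` structure, the `check_rule` helper, and the feature names `enemy_units` and `friendly_units` are all hypothetical, chosen only to show how a query might select search-tree nodes and a rule might flag inferences that violate an expected property.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class Node:
    """One node of a search tree (hypothetical structure, not from the paper)."""
    state: Dict[str, int]   # features of the state predicted by the learned model
    value: float            # value-function estimate for that state
    children: List["Node"] = field(default_factory=list)


def iter_nodes(root: Node) -> Iterator[Node]:
    """Yield every node in the tree rooted at `root` (depth-first)."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children)


def check_rule(root: Node,
               query: Callable[[Node], bool],
               rule: Callable[[Node], bool]) -> List[Node]:
    """Return the nodes selected by `query` for which `rule` does not hold."""
    return [n for n in iter_nodes(root) if query(n) and not rule(n)]


# Toy tree: two predicted successor states in which no enemy units remain.
leaf_ok = Node({"enemy_units": 0, "friendly_units": 3}, value=0.9)
leaf_bad = Node({"enemy_units": 0, "friendly_units": 2}, value=-0.4)
root = Node({"enemy_units": 1, "friendly_units": 3}, value=0.2,
            children=[leaf_ok, leaf_bad])

# Assumed expectation: a state with no enemy units should have positive value.
violations = check_rule(
    root,
    query=lambda n: n.state["enemy_units"] == 0,
    rule=lambda n: n.value > 0,
)
print(len(violations))
```

Separating the query (which inferences to inspect) from the rule (what should hold for them) lets the same traversal serve many different checks; a real tool would presumably run such checks over many search trees rather than one toy example.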
