Abstract:We present Q-Cogni, an algorithmically integrated causal reinforcement learning framework that redesigns Q-Learning with an autonomous causal structure discovery method to improve the learning process with causal inference. Q-Cogni achieves optimal learning with a pre-learned structural causal model of the environment that can be queried during the learning process to infer cause-and-effect relationships embedded in a state-action space. We leverage on the sample efficient techniques of reinforcement learning, enable reasoning about a broader set of policies and bring higher degrees of interpretability to decisions made by the reinforcement learning agent. We apply Q-Cogni on the Vehicle Routing Problem (VRP) and compare against state-of-the-art reinforcement learning algorithms. We report results that demonstrate better policies, improved learning efficiency and superior interpretability of the agent's decision making. We also compare this approach with traditional shortest-path search algorithms and demonstrate the benefits of our causal reinforcement learning framework to high dimensional problems. Finally, we apply Q-Cogni to derive optimal routing decisions for taxis in New York City using the Taxi & Limousine Commission trip record data and compare with shortest-path search, reporting results that show 85% of the cases with an equal or better policy derived from Q-Cogni in a real-world domain.