Title: SAFE-RL: Saliency-Aware Counterfactual Explainer for Deep Reinforcement Learning Policies

Apr 28, 2024

Amir Samadi, Konstantinos Koufos, Kurt Debattista, Mehrdad Dianati

Abstract: While Deep Reinforcement Learning (DRL) has emerged as a promising solution for intricate control tasks, the lack of explainability of the learned policies impedes its uptake in safety-critical applications, such as automated driving systems (ADS). Counterfactual (CF) explanations have recently gained prominence for their ability to interpret black-box Deep Learning (DL) models. CF examples are associated with minimal changes in the input, resulting in a complementary output by the DL model. Finding such alterations, particularly for high-dimensional visual inputs, poses significant challenges. Moreover, the DRL agent's action depends on a history of past state observations, and this temporal dependency further complicates the generation of CF examples. To address these challenges, we propose using a saliency map to identify the most influential input pixels across the sequence of past states observed by the agent. We then feed this map to a deep generative model, enabling the generation of plausible CFs with constrained modifications centred on the salient regions. We evaluate the effectiveness of our framework in diverse domains, including ADS and the Atari games Pong, Pacman and Space Invaders, using traditional performance metrics such as validity, proximity and sparsity. Experimental results demonstrate that this framework generates more informative and plausible CFs than the state of the art for a wide range of environments and DRL agents. To foster research in this area, we have made our datasets and code publicly available at https://github.com/Amir-Samadi/SAFE-RL.
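The abstract names validity, proximity and sparsity without defining them. As a rough illustration only, the sketch below uses definitions commonly found in the counterfactual-explanation literature; they are assumptions and may differ from the exact formulas in the paper or its repository. The function names, the flattened-pixel input format and the `tol` threshold are all hypothetical.

```python
# Minimal sketch of three common counterfactual (CF) metrics.
# Assumption: observations are flattened sequences of pixel values in [0, 1].
# These are generic definitions, not necessarily the ones used by SAFE-RL.

def validity(cf_actions, target_actions):
    """Fraction of CFs for which the agent selects the intended target action."""
    pairs = list(zip(cf_actions, target_actions))
    if not pairs:
        raise ValueError("need at least one counterfactual")
    return sum(a == t for a, t in pairs) / len(pairs)


def proximity(original, counterfactual):
    """Mean absolute per-pixel difference; lower means closer to the original."""
    if len(original) != len(counterfactual) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(o - c) for o, c in zip(original, counterfactual))
    return total / len(original)


def sparsity(original, counterfactual, tol=1e-6):
    """Fraction of pixels changed by more than tol; lower means fewer edits."""
    if len(original) != len(counterfactual) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    changed = sum(abs(o - c) > tol for o, c in zip(original, counterfactual))
    return changed / len(original)


if __name__ == "__main__":
    # Toy 4x4 observation, flattened; the CF edits two pixels only.
    original = [0.0] * 16
    counterfactual = list(original)
    counterfactual[5] = 0.5
    counterfactual[6] = 1.0

    print(validity([2, 0, 1, 1], [2, 1, 1, 1]))   # 0.75
    print(proximity(original, counterfactual))    # 0.09375
    print(sparsity(original, counterfactual))     # 0.125
```

Under these definitions, a saliency-constrained CF would be expected to score well on sparsity, because edits are confined to a small region, while still needing to change the agent's action to count as valid.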
