Demo in Workshop: XAI in Action: Past, Present, and Future Applications
Utilizing Explainability Techniques for Reinforcement Learning Model Assurance
Alexander Tapley
Explainable Reinforcement Learning (XRL) can provide transparency into the decision-making process of a Reinforcement Learning (RL) model and increase user trust and adoption in real-world use cases. By utilizing XRL techniques, researchers can identify potential vulnerabilities within a trained RL model prior to deployment, limiting the potential for mission failure or mistakes by the system. This paper introduces the ARLIN (Assured RL Model Interrogation) Toolkit, a Python library that provides explainability outputs for trained RL models, which can be used to identify potential policy vulnerabilities and critical points. Using XRL datasets, ARLIN provides a detailed analysis of an RL model's latent space, creates a semi-aggregated Markov decision process (SAMDP) that outlines the model's path through an episode, and produces cluster analytics for each node within the SAMDP to identify potential failure points and vulnerabilities within the model. To illustrate ARLIN's effectiveness, we provide sample API usage and the corresponding explainability visualizations and vulnerability point detection for a publicly available RL model. The open-source code repository is available for download at (GitHub link forthcoming).
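To make the workflow described above concrete, the sketch below implements a simplified, self-contained version of the same analysis pattern: cluster a policy's latent embeddings, aggregate transitions between clusters into a SAMDP-style graph, and flag low-confidence clusters as candidate vulnerability points. This is not ARLIN's actual API; the data here is synthetic, and the steps are a minimal illustrative stand-in for the pipeline the toolkit applies to embeddings extracted from a trained RL policy.

```python
# Simplified sketch of a SAMDP-style interrogation pipeline (not ARLIN's API):
# cluster latent embeddings, count cluster-to-cluster transitions, and flag
# clusters where the policy's confidence is low. All data is synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for an XRL dataset: per-timestep latent embeddings and the
# policy's confidence (e.g., max action probability) at each step.
n_steps, latent_dim, n_clusters = 500, 8, 5
embeddings = rng.normal(size=(n_steps, latent_dim))
confidence = rng.uniform(0.3, 1.0, size=n_steps)

# Step 1: cluster the latent space to group similar policy states.
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)

# Step 2: build a semi-aggregated MDP by counting transitions between
# the clusters of consecutive timesteps.
samdp = np.zeros((n_clusters, n_clusters), dtype=int)
for src, dst in zip(labels[:-1], labels[1:]):
    samdp[src, dst] += 1

# Step 3: per-cluster analytics -- clusters with low mean confidence are
# candidate failure points worth interrogating further.
for c in range(n_clusters):
    mean_conf = confidence[labels == c].mean()
    flag = "  <-- potential vulnerability" if mean_conf < 0.6 else ""
    print(f"cluster {c}: {np.sum(labels == c)} states, "
          f"mean confidence {mean_conf:.2f}{flag}")
print("SAMDP transition counts:\n", samdp)
```

In the toolkit itself, the embeddings, confidences, and transitions come from XRL datasets collected during policy rollouts, and the cluster analytics are presented as visualizations rather than this printout.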