Abstract:Complex sequential decision-making planning problems over infinite state spaces have been shown to be solvable by AlphaZero-type algorithms. This approach, which trains a neural model while simulating projections of possible futures with a Monte Carlo Tree Search algorithm, has been shown to be applicable to real-life planning problems. Engineers and users interacting with the resulting behavior policy might therefore benefit from automated explanations of these planners' decisions, offline or online. This paper focuses on the information within the Monte Carlo Tree Search data structure. Given its construction, this information contains much of the reasoning of the sequential decision-making algorithm and is essential for its explainability. We show novel methods that use information-theoretic tools to simplify and reduce the Monte Carlo Tree Search and to extract information from it. Such information can be used directly to construct human-understandable explanations. We show that basic explainability quantities can be calculated with limited additional computational cost, as an integrated part of the Monte Carlo Tree Search construction process. We focus on the theoretical and algorithmic aspects and provide examples of how the methods presented here can be used to construct human-understandable explanations.
Abstract:We present a novel framework for causal explanations of stochastic, sequential decision-making systems. Building on the well-studied structural causal model paradigm for causal reasoning, we show how to identify semantically distinct types of explanations for agent actions using a single unified approach. We provide results on the generality of this framework, give run-time bounds, and offer several approximate techniques. Finally, we discuss several qualitative scenarios that illustrate the framework's flexibility and efficacy.
Abstract:Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination, and information-gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible but costly. We develop the notion of a communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as a decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to optimally solving a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors significantly reduces the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.
Abstract:The system presented here shows the feasibility of modeling the knowledge involved in a complex musical activity by integrating sub-symbolic and symbolic processes. This research focuses on the question of whether there is any advantage in integrating a neural network together with a distributed artificial intelligence approach within the music domain. The primary purpose of our work is to design a model that describes the different aspects a user might be interested in considering when involved in a musical activity. The approach we suggest in this work enables the musician to encode his knowledge, intuitions, and aesthetic taste into different modules. The system captures these aspects by computing and applying three distinct functions: rules, fuzzy concepts, and learning. As a case study, we began experimenting with first species two-part counterpoint melodies. We have developed a hybrid system composed of a connectionist module and an agent-based module to combine the sub-symbolic and symbolic levels to achieve this task. The technique presented here to represent musical knowledge constitutes a new approach for composing polyphonic music.