Abstract: Research extending deep reinforcement learning (DRL) to the multi-agent setting has solved many complicated problems and achieved great success. However, almost all of these studies focus on purely discrete or purely continuous action spaces, and few works have applied multi-agent deep reinforcement learning to real-world problems, which mostly involve hybrid action spaces. Therefore, in this paper, we propose two algorithms to fill this gap: deep multi-agent hybrid soft actor-critic (MAHSAC) and multi-agent hybrid deep deterministic policy gradients (MAHDDPG). Both algorithms follow the centralized training and decentralized execution (CTDE) paradigm and can handle hybrid action space problems. Our experiments run on the multi-agent particle environment, a simple multi-agent particle world with some basic simulated physics. The experimental results show that both algorithms perform well.
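The hybrid action space mentioned above pairs a discrete choice with continuous parameters. The following is a minimal sketch of how a per-agent policy head could represent and sample such an action; all class and parameter names are hypothetical and the abstract does not specify the actual network architecture used by MAHSAC or MAHDDPG.

```python
import torch
import torch.nn as nn

class HybridPolicy(nn.Module):
    """Toy per-agent policy head for a hybrid action space:
    one discrete choice plus a continuous parameter vector."""
    def __init__(self, obs_dim, n_discrete, cont_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.logits = nn.Linear(hidden, n_discrete)   # discrete action logits
        self.mu = nn.Linear(hidden, cont_dim)         # continuous action mean
        self.log_std = nn.Linear(hidden, cont_dim)    # continuous action log-std

    def forward(self, obs):
        h = self.body(obs)
        dist_d = torch.distributions.Categorical(logits=self.logits(h))
        std = self.log_std(h).clamp(-5, 2).exp()
        dist_c = torch.distributions.Normal(self.mu(h), std)
        a_d = dist_d.sample()                 # discrete action id
        a_c = torch.tanh(dist_c.sample())     # bounded continuous parameters
        return a_d, a_c

# Usage: each agent samples its own hybrid action from its local observation.
pi = HybridPolicy(obs_dim=8, n_discrete=4, cont_dim=2)
a_d, a_c = pi(torch.randn(1, 8))
```

Under CTDE, each agent would execute with such a decentralized head while a centralized critic conditions on the joint observations and actions during training.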
Abstract: Multi-agent deep reinforcement learning has been applied to a variety of complex problems with either discrete or continuous action spaces and has achieved great success. However, most real-world environments cannot be described by discrete or continuous action spaces alone, and few works have applied deep reinforcement learning (DRL) to multi-agent problems with hybrid action spaces. Therefore, we propose a novel algorithm, Deep Multi-Agent Hybrid Soft Actor-Critic (MAHSAC), to fill this gap. The algorithm follows the centralized training but decentralized execution (CTDE) paradigm and extends the Soft Actor-Critic (SAC) algorithm, which is based on maximum entropy, to handle hybrid action space problems in multi-agent environments. Our experiments run on a simple multi-agent particle world with continuous observations and a discrete action space, along with some basic simulated physics. The experimental results show that MAHSAC performs well in terms of training speed, stability, and anti-interference ability. At the same time, it outperforms an existing independent deep hybrid learning method in both cooperative and competitive scenarios.
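For reference, the maximum-entropy objective that standard SAC optimizes, and which MAHSAC builds on, augments the expected return with a policy-entropy bonus weighted by a temperature $\alpha$:

\[
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[\, r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\right]
\]

In a hybrid action space, the entropy term would plausibly combine contributions from the discrete and continuous components of the action; the exact decomposition used per agent is specific to the paper's hybrid SAC formulation and is not given in this abstract.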
Abstract: Ensemble learning methods whose base classifier is a decision tree usually belong to the bagging or boosting family. However, to the best of our knowledge, no previous work has built an ensemble classifier by maximizing long-term returns. This paper proposes a decision forest building method for binary classification via deep reinforcement learning, called MA-H-SAC-DF. First, the building process is modeled as a decentralized partially observable Markov decision process, and a set of cooperative agents jointly constructs all base classifiers. Second, the global state and local observations are defined based on information about the parent node and the current location. Finally, the state-of-the-art deep reinforcement learning method Hybrid SAC is extended to a multi-agent system under the CTDE architecture to find an optimal decision forest building policy. The experiments indicate that MA-H-SAC-DF performs on par with random forest, AdaBoost, and GBDT on balanced datasets and outperforms them on imbalanced datasets.
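To make the Dec-POMDP modeling concrete, the sketch below shows one plausible shape for an agent's local observation and hybrid split action when building a tree node; the field names and values are illustrative assumptions, not the paper's exact definitions of state, observation, or reward.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class NodeObservation:
    """Hypothetical local observation for one agent: summary information
    about the parent node plus the agent's current location in its tree."""
    parent_stats: np.ndarray   # e.g. class distribution at the parent node
    depth: int                 # current depth in the tree being built
    tree_id: int               # which base classifier this agent builds

def hybrid_split_action(feature_id: int, threshold: float):
    """A node split as a hybrid action: a discrete feature choice
    plus a continuous threshold, matching the Hybrid SAC action form."""
    return feature_id, threshold

# Each cooperative agent repeatedly observes its current node and emits a
# hybrid split action; a shared reward reflects the forest's long-term return.
obs = NodeObservation(parent_stats=np.array([0.7, 0.3]), depth=1, tree_id=0)
action = hybrid_split_action(feature_id=2, threshold=0.35)
```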