Abstract:This paper considers the problem of cost-aware downlink sum-rate maximization via joint optimal radio access technologies (RATs) assignment and power allocation in next-generation heterogeneous wireless networks (HetNets). We consider a future HetNet comprised of multi-RATs and serving multi-connectivity edge devices (EDs), and we formulate the problem as mixed-integer non-linear programming (MINP) problem. Due to the high complexity and combinatorial nature of this problem and the difficulty to solve it using conventional methods, we propose a hierarchical multi-agent deep reinforcement learning (DRL)-based framework, called DeepRAT, to solve it efficiently and learn system dynamics. In particular, the DeepRAT framework decomposes the problem into two main stages; the RATs-EDs assignment stage, which implements a single-agent Deep Q Network (DQN) algorithm, and the power allocation stage, which utilizes a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm. Using simulations, we demonstrate how the various DRL agents efficiently interact to learn system dynamics and derive the global optimal policy. Furthermore, our simulation results show that the proposed DeepRAT algorithm outperforms existing state-of-the-art heuristic approaches in terms of network utility. Finally, we quantitatively show the ability of the DeepRAT model to quickly and dynamically adapt to abrupt changes in network dynamics, such as EDs mobility.
Abstract:Next generation wireless networks are expected to be extremely complex due to their massive heterogeneity in terms of the types of network architectures they incorporate, the types and numbers of smart IoT devices they serve, and the types of emerging applications they support. In such large-scale and heterogeneous networks (HetNets), radio resource allocation and management (RRAM) becomes one of the major challenges encountered during system design and deployment. In this context, emerging Deep Reinforcement Learning (DRL) techniques are expected to be one of the main enabling technologies to address the RRAM in future wireless HetNets. In this paper, we conduct a systematic in-depth, and comprehensive survey of the applications of DRL techniques in RRAM for next generation wireless networks. Towards this, we first overview the existing traditional RRAM methods and identify their limitations that motivate the use of DRL techniques in RRAM. Then, we provide a comprehensive review of the most widely used DRL algorithms to address RRAM problems, including the value- and policy-based algorithms. The advantages, limitations, and use-cases for each algorithm are provided. We then conduct a comprehensive and in-depth literature review and classify existing related works based on both the radio resources they are addressing and the type of wireless networks they are investigating. To this end, we carefully identify the types of DRL algorithms utilized in each related work, the elements of these algorithms, and the main findings of each related work. Finally, we highlight important open challenges and provide insights into several future research directions in the context of DRL-based RRAM. This survey is intentionally designed to guide and stimulate more research endeavors towards building efficient and fine-grained DRL-based RRAM schemes for future wireless networks.