Abstract: This paper considers opportunistic scheduler (OS) design using statistical channel state information~(CSI). We apply max-weight schedulers (MWSs) to maximize a utility function of users' average data rates. An MWS schedules, in every time slot, the user with the highest weighted instantaneous data rate. Existing methods require hundreds of time slots to adjust the MWS's weights according to the instantaneous CSI before finding the optimal weights that maximize the utility function. In contrast, our MWS design requires only a few slots to estimate the statistical CSI. Specifically, we formulate a weight optimization problem that uses the mean and variance of users' signal-to-noise ratios (SNRs) to construct constraints bounding users' feasible average rates; the utility function is the objective, and the MWS's weights are the optimization variables. We develop an iterative solver for the problem and prove that it finds the optimal weights. We also design an online architecture in which the solver adaptively generates optimal weights for networks with time-varying SNR mean and variance. Simulations show that our methods require $4\sim10$ times fewer slots to find the optimal weights and achieve $5\sim15\%$ higher average rates than existing methods.
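A minimal sketch of the max-weight scheduling rule summarized above, assuming the weights have already been obtained from the statistical-CSI optimization; the numbers and the function name are illustrative placeholders, not taken from the paper.

    import numpy as np

    def max_weight_schedule(weights, inst_rates):
        """Return the index of the user with the highest weighted instantaneous rate."""
        return int(np.argmax(weights * inst_rates))

    # Hypothetical example: three users; weights come from the statistical-CSI
    # optimization, inst_rates are the rates observed in the current slot.
    weights = np.array([0.8, 0.6, 1.4])
    inst_rates = np.array([2.1, 3.0, 2.2])   # bits per slot
    scheduled_user = max_weight_schedule(weights, inst_rates)   # -> 2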
Abstract: The restricted access window (RAW) in Wi-Fi 802.11ah networks manages contention and interference by grouping users and allocating periodic time slots for each group's transmissions. We aim to find the user grouping decisions in RAW that maximize the network's worst-case user throughput. We review existing user grouping approaches and highlight their performance limitations for this problem. We propose formulating user grouping as a graph construction problem in which vertices represent users and edge weights indicate contention and interference. This formulation uses the graph's max cut to group users and optimizes the edge weights so that the constructed graph's max cut yields the optimal grouping decisions. To achieve this optimal graph construction, we design an actor-critic graph representation learning (AC-GRL) algorithm. Specifically, the actor neural network (NN) is trained to estimate the optimal graph's edge weights from the path losses between users and access points. A graph-cut procedure uses semidefinite programming to solve the max cut efficiently and return the grouping decisions for the given weights. The critic NN approximates the user throughput achieved by the returned decisions and is used to improve the actor. Additionally, we present an architecture that uses online-measured throughput and path losses to fine-tune the decisions in response to changes in the user population and user locations. Simulations show that our methods achieve $30\%\sim80\%$ higher worst-case user throughput than existing approaches and that the proposed architecture further improves the worst-case user throughput by $5\%\sim30\%$ while ensuring timely updates of the grouping decisions.
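A hedged sketch of the max-cut step mentioned above (a semidefinite relaxation with randomized hyperplane rounding, in the spirit of Goemans-Williamson), not the AC-GRL implementation itself; the edge-weight matrix W is assumed to come from the actor NN, and the use of cvxpy is purely for illustration.

    import numpy as np
    import cvxpy as cp

    def max_cut_grouping(W, seed=0):
        """Split users into two RAW groups via a max-cut SDP relaxation.

        W is a symmetric matrix of edge weights (assumed to be produced by the
        actor NN); a larger W[i, j] pushes users i and j into different groups.
        """
        n = W.shape[0]
        X = cp.Variable((n, n), PSD=True)        # relaxes the outer product of +/-1 labels
        objective = cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4)
        cp.Problem(objective, [cp.diag(X) == 1]).solve()

        # Randomized hyperplane rounding of the SDP solution.
        V = np.linalg.cholesky(X.value + 1e-6 * np.eye(n))
        rng = np.random.default_rng(seed)
        return np.sign(V @ rng.standard_normal(n))   # +/-1 group label per user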
Abstract: In this paper, we develop a knowledge-assisted deep reinforcement learning (DRL) algorithm to design wireless schedulers for fifth-generation (5G) cellular networks with time-sensitive traffic. Since the scheduling policy is a deterministic mapping from channel and queue states to scheduling actions, it can be optimized using deep deterministic policy gradient (DDPG). We show that a straightforward implementation of DDPG converges slowly, has poor quality-of-service (QoS) performance, and cannot be implemented in real-world 5G systems, which are non-stationary in general. To address these issues, we propose a theoretical DRL framework in which theoretical models from wireless communications are used to formulate the Markov decision process in DRL. To reduce the convergence time and improve the QoS of each user, we design a knowledge-assisted DDPG (K-DDPG) that exploits expert knowledge of the scheduler design problem, such as knowledge of the QoS, the target scheduling policy, and the importance of each training sample, which is determined by the approximation error of the value function and the number of packet losses. Furthermore, we develop an architecture for online training and inference, where K-DDPG initializes the scheduler off-line and then fine-tunes it online to handle the mismatch between off-line simulations and non-stationary real-world systems. Simulation results show that our approach significantly reduces the convergence time of DDPG and achieves better QoS than existing schedulers, reducing packet losses by $30\%\sim50\%$. Experimental results show that, with off-line initialization, our approach achieves better initial QoS than random initialization, and the online fine-tuning converges in a few minutes.
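As an illustrative, hedged sketch of one piece of the expert knowledge described above, the snippet below weights replay samples by the value-function approximation (TD) error and the number of packet losses; the weighting factor, names, and normalization are assumptions, not details stated in the abstract.

    import numpy as np

    def replay_probabilities(td_errors, packet_losses, loss_weight=0.5, eps=1e-3):
        """Give higher replay probability to transitions with larger value-function
        approximation error or more packet losses (illustrative weighting only)."""
        priority = np.abs(td_errors) + loss_weight * packet_losses + eps
        return priority / priority.sum()

    # Example: four stored transitions.
    probs = replay_probabilities(np.array([0.10, 0.50, 0.05, 0.20]),
                                 np.array([0, 2, 0, 1]))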
Abstract: As one of the key communication scenarios in the 5th-generation (5G) and 6th-generation (6G) cellular networks, ultra-reliable and low-latency communications (URLLC) will be central to the development of various emerging mission-critical applications. State-of-the-art mobile communication systems do not fulfill the end-to-end delay and overall reliability requirements of URLLC, and a holistic framework that takes into account latency, reliability, availability, scalability, and decision-making under uncertainty is lacking. Driven by recent breakthroughs in deep neural networks, deep learning algorithms have been considered promising tools for developing enabling technologies for URLLC in future 6G networks. This tutorial illustrates how to integrate the theoretical knowledge of wireless communications (models, analysis tools, and optimization frameworks) into different kinds of deep learning algorithms for URLLC. We first introduce the background of URLLC and review promising network architectures and deep learning frameworks in 6G. To better illustrate how to improve learning algorithms with theoretical knowledge, we revisit model-based analysis tools and cross-layer optimization frameworks for URLLC. Following that, we examine the potential of applying supervised/unsupervised deep learning and deep reinforcement learning in URLLC and summarize related open problems. Finally, we provide simulation and experimental results to validate the effectiveness of different learning algorithms and discuss future directions.
Abstract: In future 6th-generation (6G) networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications with stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. These model-based solutions provide useful insights but cannot be directly implemented in practice. In this article, we first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC and discuss some open problems of these methods. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC. The basic idea is to merge theoretical models and real-world data when analyzing latency and reliability and when training deep neural networks (DNNs). Deep transfer learning is adopted in the architecture to fine-tune pre-trained DNNs in non-stationary networks. Further, considering that the computing capacity of each user and each mobile edge computing server is limited, federated learning is applied to improve learning efficiency. Finally, we provide experimental and simulation results and discuss future directions.
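A minimal sketch of a federated-averaging step consistent with the federated-learning role described above, assuming each user or edge server uploads locally fine-tuned DNN parameters; the FedAvg-style sample-size weighting and all names are assumptions, not details given in the abstract.

    import numpy as np

    def federated_average(client_params, client_sizes):
        """Aggregate per-client DNN parameters with a sample-size-weighted average
        (FedAvg-style); client_params[k] is the list of parameter arrays of client k."""
        sizes = np.asarray(client_sizes, dtype=float)
        coeffs = sizes / sizes.sum()
        return [sum(c * layer for c, layer in zip(coeffs, layers))
                for layers in zip(*client_params)]

    # Hypothetical example: two clients, each holding two parameter arrays.
    global_params = federated_average(
        [[np.ones((2, 2)), np.zeros(3)], [2 * np.ones((2, 2)), np.ones(3)]],
        client_sizes=[100, 300],
    )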