Abstract: Fifth-generation (5G) New Radio (NR) cellular networks support a wide range of new services, many of which require an application-specific quality of service (QoS), e.g., in terms of a guaranteed minimum bit rate or a maximum tolerable delay. Therefore, scheduling multiple parallel data flows, each serving a unique application instance, is bound to become an even more challenging task than in previous generations. Leveraging recent advances in deep reinforcement learning, we propose a QoS-Aware Deep Reinforcement learning Agent (QADRA) scheduler for NR networks. In contrast to state-of-the-art scheduling heuristics, the QADRA scheduler explicitly optimizes for the QoS satisfaction rate while simultaneously maximizing the network performance. Moreover, we train our algorithm end-to-end on these objectives. We evaluate QADRA in a full-scale, near-product, system-level NR simulator and demonstrate a significant boost in network performance. In our particular evaluation scenario, the QADRA scheduler improves network throughput by 30% compared to state-of-the-art baselines while maintaining the QoS satisfaction rate of VoIP users served by the network.