Abstract: Network energy efficiency is a key pillar in the design and operation of wireless communication systems. In this paper, we investigate a dense radio access network (dense-RAN) capable of radiated power management at the base station (BS). Aiming to improve the long-term network energy efficiency, an optimization problem is formulated by collaboratively managing the radiated power levels of multiple BSs, subject to constraints on the users' traffic volume and achievable rate. Considering stochastic traffic arrivals at the users and time-varying network interference, we first formulate the problem as a Markov decision process (MDP) and then develop a novel deep reinforcement learning (DRL) framework based on the cloud-RAN operation scheme. To tackle the trade-off between complexity and performance, the overall optimization of multi-BS energy efficiency under the multiplicative complexity constraint is modeled to achieve near-optimal performance using a deep Q-network (DQN). In the DQN, each BS first maximizes its individual energy efficiency and then cooperates with the other BSs to maximize the overall multi-BS energy efficiency. Simulation results demonstrate that the proposed algorithm converges faster and improves network energy efficiency by 5% and 10% compared with the Q-learning and sleep-scheme benchmarks, respectively.
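The abstract's core building block is a DQN agent that maps a network state (e.g., traffic and interference observations) to one of a discrete set of radiated-power levels, with the energy-efficiency reward driving learning. The following is a minimal, illustrative numpy sketch of such an agent, not the paper's architecture: the state features, the single hidden layer, and the reward definition (achievable rate divided by radiated power) are all assumptions made for the example.

```python
import numpy as np

class TinyDQN:
    """Minimal one-hidden-layer Q-network for discrete power-level selection.

    Illustrative sketch only: state = per-BS features (assumed traffic and
    interference observations), actions = discrete radiated-power levels,
    reward = energy efficiency (assumed here as rate / power).
    """

    def __init__(self, state_dim, n_actions, hidden=32, lr=1e-2, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)
        self.lr, self.gamma = lr, gamma

    def q_values(self, s):
        # Forward pass: tanh hidden layer, linear output (one Q per action).
        h = np.tanh(s @ self.W1 + self.b1)
        return h @ self.W2 + self.b2, h

    def act(self, s, eps, rng):
        # Epsilon-greedy action selection over power levels.
        q, _ = self.q_values(s)
        if rng.random() < eps:
            return int(rng.integers(len(q)))
        return int(np.argmax(q))

    def update(self, s, a, r, s_next, done):
        # TD target: r + gamma * max_a' Q(s', a')  (no target network here).
        q, h = self.q_values(s)
        q_next, _ = self.q_values(s_next)
        target = r + (0.0 if done else self.gamma * q_next.max())
        td_err = q[a] - target
        # Backprop of 0.5 * td_err^2 through the two layers, by hand.
        dq = np.zeros_like(q)
        dq[a] = td_err
        dW2, db2 = np.outer(h, dq), dq
        dh = (self.W2 @ dq) * (1.0 - h ** 2)
        dW1, db1 = np.outer(s, dh), dh
        self.W1 -= self.lr * dW1
        self.b1 -= self.lr * db1
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * db2
        return 0.5 * td_err ** 2
```

In the cooperative scheme the abstract describes, one such agent per BS would first be trained on its own energy-efficiency reward, after which the per-BS rewards are combined into an overall multi-BS objective; the sketch above covers only the single-agent step.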