Abstract: By exploiting the advantages of non-orthogonal multiple access (NOMA), NOMA-aided mobile edge computing (MEC) can provide scalable and low-latency computing services for the Internet of Things. However, given the prevalent stochasticity of wireless networks and the sophisticated signal processing of NOMA, it is critical but challenging to design an efficient task offloading algorithm for NOMA-aided MEC, especially with a large number of devices. This paper presents an online algorithm that jointly optimizes offloading decisions and resource allocation to maximize the long-term system utility (i.e., a measure of throughput and fairness). Since the optimization variables are temporally coupled, we first apply the Lyapunov technique to decouple the long-term stochastic optimization into a series of per-slot deterministic subproblems, which does not require any prior knowledge of network dynamics. Second, we transform the non-convex per-slot subproblem of optimizing NOMA power allocation into an equivalent convex form by introducing a set of auxiliary variables, whereby the time complexity is reduced from exponential to $\mathcal{O}(M^{3/2})$. The proposed algorithm is proved to be asymptotically optimal, even under partial knowledge of the device states at the base station. Simulation results validate the superiority of the proposed algorithm in terms of system utility, stability improvement, and overhead reduction.
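The Lyapunov decoupling mentioned above can be illustrated with a minimal sketch of the generic drift-plus-penalty rule for a single task queue. This is not the paper's algorithm: the function names, the (utility, service) action pairs, and the weight `V` are hypothetical and chosen only for illustration.

```python
def queue_update(q, arrival, service):
    """Standard queue dynamics: Q(t+1) = max(Q(t) - service, 0) + arrival."""
    return max(q - service, 0.0) + arrival


def per_slot_decision(q, actions, V):
    """Pick the action that minimizes the per-slot drift-plus-penalty term.

    Each action is a hypothetical (utility, service) pair. The arrival term
    is identical for every action, so it is dropped from the comparison,
    leaving the objective  -V * utility - Q(t) * service.
    """
    return min(actions, key=lambda a: -V * a[0] - q * a[1])


# Toy action set (assumed): high utility with no queue service,
# or lower utility that drains three units of backlog.
actions = [(1.0, 0.0), (0.2, 3.0)]
empty_choice = per_slot_decision(0.0, actions, V=1.0)
backlog_choice = per_slot_decision(10.0, actions, V=1.0)
```

With an empty queue the rule favors utility; with a large backlog it favors draining the queue. The decision uses only the current queue length, which is why no prior knowledge of the network statistics is needed.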
Abstract: The cell-free network is considered a promising architecture for satisfying more of the demands of future wireless networks, in which distributed access points coordinate with an edge cloud processor to jointly serve a smaller number of user equipments in a compact area. In this paper, the problem of uplink beamforming design is investigated for maximizing the long-term energy efficiency (EE) with the aid of deep reinforcement learning (DRL) in the cell-free network. Firstly, based on minimum mean square error channel estimation and exploiting successive interference cancellation for signal detection, the expression of the signal-to-interference-plus-noise ratio (SINR) is derived. Secondly, according to the formulation of the SINR, we define the long-term EE, which is a function of the beamforming matrix. Thirdly, to address the dynamic beamforming design with continuous state and action spaces, a DRL-enabled beamforming design is proposed based on the deep deterministic policy gradient (DDPG) algorithm by taking advantage of its double-network architecture. Finally, simulation results indicate that the DDPG-based beamforming design is capable of converging to the optimal EE performance. Furthermore, the influence of hyper-parameters on the EE performance of the DDPG-based beamforming design is investigated, and it is demonstrated that an appropriately chosen discount factor and hidden-layer size can improve the EE performance.
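To make the role of the discount factor concrete, the sketch below computes a discounted sum of per-slot EE rewards and applies the soft target-network update that is standard in DDPG. The abstract does not give the paper's reward definition or update rule, so the function names, the reward values, and the parameter `tau` are assumptions for illustration only.

```python
def discounted_return(rewards, gamma):
    """Discounted sum r_0 + gamma * r_1 + gamma^2 * r_2 + ... over per-slot rewards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def soft_update(target, online, tau):
    """Soft target update used in standard DDPG: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt for wt, w in zip(target, online)]


# Hypothetical per-slot EE values and weight vectors.
long_term_ee = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
new_target = soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5)
```

A discount factor closer to 1 weights future slots more heavily, which is one reason its choice can affect the long-term EE that the agent learns to optimize.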