Abstract: In multi-objective reinforcement learning (MORL), agents are tasked with optimising decision-making behaviours that trade off multiple, possibly conflicting, objectives. Decomposition-based MORL is a family of solution methods that use a set of utility functions to decompose the multi-objective problem into individual single-objective problems, which are solved simultaneously to approximate a Pareto front of policies. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on the Upper Confidence Bound (UCB) that efficiently searches for the most promising weight vectors at different stages of the learning process, with the aim of maximising the hypervolume of the resulting Pareto front. The proposed method is shown to outperform various MORL baselines on MuJoCo benchmark problems across different random seeds. The code is available online at https://github.com/SYCAMORE-1/ucb-MOPPO.
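The ingredients named in this abstract (a linear utility, the hypervolume of a front, and a UCB rule for choosing among candidate weight vectors) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the exploration constant `c`, and the use of a running mean of observed hypervolume gain as the UCB reward are all assumptions made for illustration, and the hypervolume routine handles only two maximised objectives.

```python
import math


def scalarise(returns, w):
    """Linear utility: weighted sum of the per-objective returns."""
    return sum(wi * ri for wi, ri in zip(w, returns))


def hypervolume_2d(points, ref):
    """Area dominated by 2-objective points relative to `ref` (maximisation)."""
    # Keep only points that strictly improve on the reference in both objectives.
    pts = sorted(
        (p for p in points if p[0] > ref[0] and p[1] > ref[1]),
        key=lambda p: p[0],
        reverse=True,
    )
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        # Sweeping from largest x, a point adds area only if it raises the y level.
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def ucb_select(mean_gain, counts, c=1.0):
    """Return the index of the weight vector with the highest UCB score.

    mean_gain[i]: hypothetical running mean of the hypervolume gain observed
                  after training with weight vector i (an assumed reward).
    counts[i]:    number of times weight vector i has been selected so far.
    """
    for i, n in enumerate(counts):
        if n == 0:  # try every weight vector once before using the bound
            return i
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n)
              for m, n in zip(mean_gain, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, `ucb_select([0.5, 0.4], [100, 1])` returns index 1: the second weight vector has a slightly lower mean gain, but it has been tried only once, so its exploration bonus dominates.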
Abstract: Coverage and capacity are important metrics for evaluating the performance of wireless networks, but they conflict in several ways: for example, high transmit power contributes to large coverage, but high inter-cell interference reduces capacity. To strike a balance between coverage and capacity, a novel model is proposed for the coverage and capacity optimization (CCO) of networks assisted by simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs). To solve the CCO problem, a machine learning-based multi-objective optimization algorithm, the multi-objective proximal policy optimization (MO-PPO) algorithm, is proposed. The core of this algorithm is a loss function-based update strategy, which uses a min-norm solver to calculate weights for the coverage and capacity loss functions at each update. The numerical results demonstrate that the investigated update strategy outperforms fixed weight-based MO algorithms.
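For two objectives, a min-norm weighting has a closed form, sketched below. The sketch assumes the solver operates on the gradients of the two losses, as in multiple-gradient descent; the abstract does not state this, and the function name `min_norm_weights` and the plain-list gradient representation are hypothetical.

```python
def min_norm_weights(g1, g2):
    """Weights (a, 1 - a), a in [0, 1], minimising ||a*g1 + (1 - a)*g2||^2.

    g1, g2: flattened gradients of the two losses (assumed here to be the
    coverage and capacity losses), given as equal-length sequences of floats.
    """
    diff_sq = sum((a - b) ** 2 for a, b in zip(g1, g2))
    if diff_sq == 0.0:
        # Identical gradients: every convex combination has the same norm.
        return 0.5, 0.5
    # Unconstrained minimiser of the quadratic in a, then clipped to [0, 1].
    alpha = sum((b - a) * b for a, b in zip(g1, g2)) / diff_sq
    alpha = min(1.0, max(0.0, alpha))
    return alpha, 1.0 - alpha
```

With orthogonal gradients of equal norm, such as `[1.0, 0.0]` and `[0.0, 1.0]`, the weights are `(0.5, 0.5)`. When one gradient is a scaled-up copy of the other, all weight goes to the shorter one.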
Abstract: The explosion in mobile data traffic, together with ever-increasing expectations for higher quality of service, calls for the development of AI algorithms for wireless network optimization. In this paper, we investigate how to learn policies that can automatically adjust the configuration parameters of every cell in the network in response to changes in user demand. Our solution combines existing methods for offline learning and adapts them in a principled way to overcome crucial challenges arising in this context. Empirical results suggest that our proposed method will achieve important performance gains when deployed in the real network while satisfying practical constraints on computational efficiency.
Abstract: Vertical Total Electron Content (vTEC) is an ionospheric characteristic used to derive the signal delay imposed by the ionosphere on near-vertical trans-ionospheric links. The major aim of this paper is to design a prediction model based on the main factors that influence the variability of this parameter on diurnal, seasonal and long-term time scales. The model should be accurate and general (comprehensive) enough to efficiently approximate the high variations of vTEC. However, good approximation and generalization are conflicting objectives. For this reason, a Genetic Programming (GP) approach with Multi-objective Evolutionary Algorithm based on Decomposition characteristics (GP-MOEA/D) is designed and proposed for modeling vTEC over Cyprus. Experimental results show that the multi-objective GP model, built from real vTEC measurements obtained over a period of 11 years, produces a good approximation of the modeled parameter and can be implemented as a local model to account for the ionospheric imposed error in positioning. In particular, the GP-MOEA/D approach performs better in most cases than a Single Objective Optimization GP, a GP with Non-dominated Sorting Genetic Algorithm-II (NSGA-II) characteristics and the previously proposed Neural Network-based approach.
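The decomposition idea behind MOEA/D can be sketched as follows: each weight vector defines one scalar subproblem, and every subproblem keeps the candidate that scores best under its own scalarisation. The abstract does not say which scalarising function GP-MOEA/D uses, so the Tchebycheff form below is an assumption. The function names and the two-objective tuples (read as hypothetical approximation-error and generalization-error values, both minimised) are illustrative only.

```python
def tchebycheff(objectives, weights, ideal):
    """Tchebycheff scalarisation for minimisation: max_i w_i * |f_i - z_i|."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def best_per_subproblem(candidates, weight_vectors, ideal):
    """For each weight vector, return the index of the best-scoring candidate."""
    return [
        min(range(len(candidates)),
            key=lambda i: tchebycheff(candidates[i], w, ideal))
        for w in weight_vectors
    ]


# Hypothetical (approximation error, generalization error) pairs.
candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
weight_vectors = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
picks = best_per_subproblem(candidates, weight_vectors, ideal=(0.0, 0.0))
```

Here each subproblem selects a different candidate, so the three weight vectors together cover three distinct trade-offs between the two objectives.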