Abstract:In Multi-objective Reinforcement Learning (MORL) agents are tasked with optimising decision-making behaviours that trade-off between multiple, possibly conflicting, objectives. MORL based on decomposition is a family of solution methods that employ a number of utility functions to decompose the multi-objective problem into individual single-objective problems solved simultaneously in order to approximate a Pareto front of policies. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process, with the aim of maximising the hypervolume of the resulting Pareto front. The proposed method is shown to outperform various MORL baselines on Mujoco benchmark problems across different random seeds. The code is online at: https://github.com/SYCAMORE-1/ucb-MOPPO.
Abstract:The explosion in mobile data traffic together with the ever-increasing expectations for higher quality of service call for the development of AI algorithms for wireless network optimization. In this paper, we investigate how to learn policies that can automatically adjust the configuration parameters of every cell in the network in response to the changes in the user demand. Our solution combines existent methods for offline learning and adapts them in a principled way to overcome crucial challenges arising in this context. Empirical results suggest that our proposed method will achieve important performance gains when deployed in the real network while satisfying practical constrains on computational efficiency.
Abstract:We describe two applications of machine learning in the context of IP/Optical networks. The first one allows agile management of resources at a core IP/Optical network by using machine learning for short-term and long-term prediction of traffic flows and joint global optimization of IP and optical layers using colorless/directionless (CD) flexible ROADMs. Multilayer coordination allows for significant cost savings, flexible new services to meet dynamic capacity needs, and improved robustness by being able to proactively adapt to new traffic patterns and network conditions. The second application is important as we migrate our metro networks to Open ROADM networks, to allow physical routing without the need for detailed knowledge of optical parameters. We discuss a proof-of-concept study, where detailed performance data for wavelengths on a current flexible ROADM network is used for machine learning to predict the optical performance of each wavelength. Both applications can be efficiently implemented by using a SDN (Software Defined Network) controller.