David Lynch

UCB-driven Utility Function Search for Multi-objective Reinforcement Learning

May 01, 2024

Yucheng Shi, Alexandros Agapitos, David Lynch, Giorgio Cruciata, Hao Wang, Yayu Yao, Aleksandar Milenovic

Abstract: In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours that trade off multiple, possibly conflicting, objectives. MORL based on decomposition is a family of solution methods that employ a number of utility functions to decompose the multi-objective problem into individual single-objective problems, which are solved simultaneously in order to approximate a Pareto front of policies. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process, with the aim of maximising the hypervolume of the resulting Pareto front. The proposed method is shown to outperform various MORL baselines on MuJoCo benchmark problems across different random seeds. The code is online at: https://github.com/SYCAMORE-1/ucb-MOPPO.
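The ideas named in this abstract (a linear utility over objectives, Upper Confidence Bound selection among weight vectors, and hypervolume as the quantity to maximise) can be illustrated with a minimal sketch. It is not the authors' implementation, which is available at the repository linked above. It assumes two objectives, a standard UCB1 score, and hypervolume gain as the bandit reward; the names `scalarise`, `ucb_select`, `hypervolume_2d`, `search`, and the `evaluate` callback are hypothetical and chosen only for this example.

```python
import math


def scalarise(returns, w):
    """Linear utility: weighted sum of the per-objective returns."""
    return sum(wi * ri for wi, ri in zip(w, returns))


def ucb_select(counts, means, t, c=1.0):
    """Index of the weight vector with the highest UCB1 score (assumed form)."""
    # Try every candidate once before relying on the confidence bonus.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + c * math.sqrt(2.0 * math.log(t) / n)
              for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def hypervolume_2d(points, ref):
    """Area dominated by `points` (maximisation) relative to `ref`."""
    # Keep points that strictly dominate the reference, largest x first.
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: p[0], reverse=True)
    hv, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:  # only non-dominated points add area
            hv += (x - ref[0]) * (y - best_y)
            best_y = y
    return hv


def search(weights, evaluate, rounds, ref=(0.0, 0.0), c=1.0):
    """Pick weight vectors by UCB, rewarding each by its hypervolume gain.

    `evaluate(w)` is a hypothetical stand-in for training or evaluating a
    policy under weight vector `w`; it returns a 2-tuple of returns.
    """
    counts = [0] * len(weights)
    means = [0.0] * len(weights)
    front = []
    for t in range(1, rounds + 1):
        i = ucb_select(counts, means, t, c)
        before = hypervolume_2d(front, ref)
        front.append(evaluate(weights[i]))
        gain = hypervolume_2d(front, ref) - before
        counts[i] += 1
        means[i] += (gain - means[i]) / counts[i]  # running mean of gains
    return front, counts
```

In this sketch, `search` would be called with a list of candidate weight vectors and a function that runs the single-objective learner for the chosen vector. Using hypervolume gain as the reward is one plausible design choice; the paper may define the bandit reward differently.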

Offline Contextual Bandits for Wireless Network Optimization

Nov 11, 2021

Miguel Suau, Alexandros Agapitos, David Lynch, Derek Farrell, Mingqi Zhou, Aleksandar Milenovic

Abstract: The explosion in mobile data traffic, together with ever-increasing expectations for higher quality of service, calls for the development of AI algorithms for wireless network optimization. In this paper, we investigate how to learn policies that can automatically adjust the configuration parameters of every cell in the network in response to changes in user demand. Our solution combines existing methods for offline learning and adapts them in a principled way to overcome crucial challenges arising in this context. Empirical results suggest that our proposed method will achieve important performance gains when deployed in the real network while satisfying practical constraints on computational efficiency.

Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction

Jun 12, 2018

Gagan Choudhury, David Lynch, Gaurav Thakur, Simon Tse

Abstract: We describe two applications of machine learning in the context of IP/Optical networks. The first one allows agile management of resources in a core IP/Optical network by using machine learning for short-term and long-term prediction of traffic flows and joint global optimization of IP and optical layers using colorless/directionless (CD) flexible ROADMs. Multilayer coordination allows for significant cost savings, flexible new services to meet dynamic capacity needs, and improved robustness by being able to proactively adapt to new traffic patterns and network conditions. The second application is important as we migrate our metro networks to Open ROADM networks, to allow physical routing without the need for detailed knowledge of optical parameters. We discuss a proof-of-concept study, where detailed performance data for wavelengths on a current flexible ROADM network is used for machine learning to predict the optical performance of each wavelength. Both applications can be efficiently implemented by using an SDN (Software Defined Network) controller.
