Abstract:This study develops a framework based on reinforcement learning to dynamically manage a large portfolio of search operators within meta-heuristics. Using the idea of tabu search, the framework allows for continuous adaptation by temporarily excluding less efficient operators and updating the portfolio composition during the search. A Q-learning-based adaptive operator selection mechanism is used to select the most suitable operator from the dynamically updated portfolio at each stage. Unlike traditional approaches, the proposed framework requires no input from the experts regarding the search operators, allowing domain-specific non-experts to effectively use the framework. The performance of the proposed framework is analyzed through an application to the permutation flowshop scheduling problem. The results demonstrate the superior performance of the proposed framework against state-of-the-art algorithms in terms of optimality gap and convergence speed.
Abstract:Despite the extensive research efforts and the remarkable results obtained on Vehicle Routing Problems (VRP) by using algorithms proposed by the Machine Learning community that are partially or entirely based on data-driven analysis, most of these approaches are still seldom employed by the Operations Research (OR) community. Among the possible causes, we believe, the different approach to the computational evaluation of the proposed methods may play a major role. With the current work, we want to highlight a number of challenges (and possible ways to handle them) arising during the computational studies of heuristic approaches to VRPs that, if appropriately addressed, may produce a computational study having the characteristics of those presented in OR papers, thus hopefully promoting the collaboration between the two communities.