A key challenge in solving a combinatorial optimization problem is how to guide the agent (i.e., solver) to efficiently explore the enormous search space. Conventional approaches often rely on enumeration (e.g., exhaustive, random, or tabu search) or have to restrict the exploration to rather limited regions (e.g., a single path as in iterative algorithms). In this paper, we show it is possible to use machine learning to speedup the exploration. In particular, a value network is trained to evaluate solution candidates, which provides a useful structure (i.e., an approximate value surface) over the search space; this value network is then used to screen solutions to help a black-box optimization agent to initialize or restart so as to navigate through the search space towards desirable solutions. Experiments demonstrate that the proposed ``Learn to Restart'' algorithm achieves promising results in solving Capacitated Vehicle Routing Problems (CVRPs).

Title:Learning (Re-)Starting Solutions for Vehicle Routing Problems

Paper and Code