Motivated by the promising advances of deep-reinforcement learning (DRL) applied to cooperative multi-agent systems we propose a model and learning procedure to solve the Capacitated Multi-Vehicle Routing Problem (CMVRP) with fixed fleet size. Our learning procedure follows a centralized-training and decentralized-execution paradigm. We empirically test our model and showed its capability for producing near-optimal solutions through cooperative actions. In large instances, our model generates better solutions than other commonly used heuristics. Additionally, our model can solve arbitrary instances of the CMVRP without requiring re-training.

Title:Deep Reinforcement Learning for Routing a Heterogeneous Fleet of Vehicles

Paper and Code