Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xuanhao Pan

Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Nov 14, 2024

Xuanhao Pan, Chenguang Wang, Chaolong Ying, Ye Xue, Tianshu Yu

Figure 1 for Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Figure 2 for Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Figure 3 for Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Figure 4 for Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Abstract:The Travelling Salesman Problem (TSP) remains a fundamental challenge in combinatorial optimization, inspiring diverse algorithmic strategies. This paper revisits the "heatmap + Monte Carlo Tree Search (MCTS)" paradigm that has recently gained traction for learning-based TSP solutions. Within this framework, heatmaps encode the likelihood of edges forming part of the optimal tour, and MCTS refines this probabilistic guidance to discover optimal solutions. Contemporary approaches have predominantly emphasized the refinement of heatmap generation through sophisticated learning models, inadvertently sidelining the critical role of MCTS. Our extensive empirical analysis reveals two pivotal insights: 1) The configuration of MCTS strategies profoundly influences the solution quality, demanding meticulous tuning to leverage their full potential; 2) Our findings demonstrate that a rudimentary and parameter-free heatmap, derived from the intrinsic $k$-nearest nature of TSP, can rival or even surpass the performance of complicated heatmaps, with strong generalizability across various scales. Empirical evaluations across various TSP scales underscore the efficacy of our approach, achieving competitive results. These observations challenge the prevailing focus on heatmap sophistication, advocating a reevaluation of the paradigm to harness both components synergistically. Our code is available at: https://github.com/LOGO-CUHKSZ/rethink_mcts_tsp.

Via

Access Paper or Ask Questions

Towards Principled Task Grouping for Multi-Task Learning

Feb 23, 2024

Chenguang Wang, Xuanhao Pan, Tianshu Yu

Abstract:This paper presents a novel approach to task grouping in Multitask Learning (MTL), advancing beyond existing methods by addressing key theoretical and practical limitations. Unlike prior studies, our approach offers a more theoretically grounded method that does not rely on restrictive assumptions for constructing transfer gains. We also propose a flexible mathematical programming formulation which can accommodate a wide spectrum of resource constraints, thus enhancing its versatility. Experimental results across diverse domains, including computer vision datasets, combinatorial optimization benchmarks and time series tasks, demonstrate the superiority of our method over extensive baselines, validating its effectiveness and general applicability in MTL.

Via

Access Paper or Ask Questions

Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

Apr 19, 2023

Yan Jin, Yuandong Ding, Xuanhao Pan, Kun He, Li Zhao, Tao Qin, Lei Song, Jiang Bian

Figure 1 for Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

Figure 2 for Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

Figure 3 for Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

Figure 4 for Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

Abstract:Traveling Salesman Problem (TSP), as a classic routing optimization problem originally arising in the domain of transportation and logistics, has become a critical task in broader domains, such as manufacturing and biology. Recently, Deep Reinforcement Learning (DRL) has been increasingly employed to solve TSP due to its high inference efficiency. Nevertheless, most of existing end-to-end DRL algorithms only perform well on small TSP instances and can hardly generalize to large scale because of the drastically soaring memory consumption and computation time along with the enlarging problem scale. In this paper, we propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer. Particularly, Pointerformer adopts both reversible residual network in the encoder and multi-pointer network in the decoder to effectively contain memory consumption of the encoder-decoder architecture. To further improve the performance of TSP solutions, Pointerformer employs both a feature augmentation method to explore the symmetries of TSP at both training and inference stages as well as an enhanced context embedding approach to include more comprehensive context information in the query. Extensive experiments on a randomly generated benchmark and a public benchmark have shown that, while achieving comparative results on most small-scale TSP instances as SOTA DRL approaches do, Pointerformer can also well generalize to large-scale TSPs.

* Accepted by AAAI 2023, February 2023

Via

Access Paper or Ask Questions

H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

Apr 19, 2023

Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, Jiang Bian

Figure 1 for H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

Figure 2 for H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

Figure 3 for H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

Figure 4 for H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

Abstract:We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Travelling Salesman Problem (TSP). The proposed H-TSP constructs a solution of a TSP instance starting from the scratch relying on two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiment) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially only containing the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for the given TSP instances without relying on any time-consuming search procedures. To demonstrate effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP can achieve comparable results (gap 3.42% vs. 7.32%) as SOTA search-based approaches, and more importantly, we reduce the time consumption up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly those that are time-sensitive e.g., on-call routing and ride hailing service.

* Accepted by AAAI 2023, February 2023

Via

Access Paper or Ask Questions