Abstract: Domain-Independent Dynamic Programming (DIDP) is a state-space search paradigm for combinatorial optimization based on dynamic programming (DP). In its current implementation, DIDP guides the search using user-defined dual bounds. Reinforcement learning (RL) is increasingly applied to combinatorial optimization and shares key structures with DP: both are characterized by the Bellman equation and state-based transition systems. We propose using RL to obtain a heuristic function that guides the search in DIDP. We develop two RL-based guidance approaches: value-based guidance using Deep Q-Networks and policy-based guidance using Proximal Policy Optimization. Our experiments indicate that RL-based guidance significantly outperforms standard DIDP and problem-specific greedy heuristics given the same number of node expansions. Further, despite longer node evaluation times, RL guidance achieves better run-time performance than standard DIDP on three of four benchmark domains.
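To illustrate the value-based guidance idea, the sketch below orders search nodes by accumulated cost plus a learned cost-to-go estimate. The callbacks `value_net`, `successors`, and `is_goal` are hypothetical stand-ins for a trained Deep Q-Network and a DIDP state transition system, not the authors' implementation; since a learned estimate is generally not a valid bound, this ordering trades optimality guarantees for guidance quality.

```python
import heapq
import itertools

def rl_guided_search(initial_state, value_net, successors, is_goal):
    """Best-first search over a DP state space ordered by f = g + v(s),
    where v(s) is a learned estimate of remaining cost (e.g., from a DQN).

    Hypothetical callback signatures:
      value_net(state) -> float
      successors(state) -> iterable of (action, cost, next_state)
      is_goal(state) -> bool
    """
    tie = itertools.count()  # tie-breaker so states are never compared directly
    frontier = [(value_net(initial_state), next(tie), 0.0, initial_state, [])]
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return g, path  # first goal popped under the learned ordering
        for action, cost, nxt in successors(state):
            g2 = g + cost
            heapq.heappush(
                frontier, (g2 + value_net(nxt), next(tie), g2, nxt, path + [action])
            )
    return None  # no goal state reachable
```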
Abstract: For combinatorial optimization problems, model-based paradigms such as mixed-integer programming (MIP) and constraint programming (CP) aim to decouple the modeling and the solving of a problem: the 'holy grail' of declarative problem solving. We propose domain-independent dynamic programming (DIDP), a new model-based paradigm based on dynamic programming (DP). While DP is not new, it has typically been implemented as a problem-specific method. We introduce Dynamic Programming Description Language (DyPDL), a formalism to define DP models based on a state transition system, inspired by AI planning. We show that heuristic search algorithms can be used to solve DyPDL models and propose seven DIDP solvers. We experimentally compare our DIDP solvers with commercial MIP and CP solvers (solving MIP and CP models, respectively) on common benchmark instances of eleven combinatorial optimization problem classes. We show that DIDP outperforms MIP in nine problem classes, CP in nine, and both MIP and CP in seven.
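To make the DP-model view concrete, here is a minimal Bellman-style recursion for 0/1 knapsack written as a memoized state transition system in plain Python. It mirrors the flavor of a DyPDL model (states, transitions, base cases) without using the DIDP implementation itself; the instance data is made up for illustration.

```python
from functools import lru_cache

# Hypothetical instance data for illustration.
weights = (3, 4, 2, 5)
profits = (4, 5, 3, 8)
CAPACITY = 10

@lru_cache(maxsize=None)
def V(i, remaining):
    """Bellman equation over states (next item index, remaining capacity):
    V(i, r) = max(V(i+1, r), profits[i] + V(i+1, r - weights[i]))."""
    if i == len(weights):           # base case: no items left to consider
        return 0
    best = V(i + 1, remaining)      # transition: skip item i
    if weights[i] <= remaining:     # transition: pack item i (precondition)
        best = max(best, profits[i] + V(i + 1, remaining - weights[i]))
    return best

print(V(0, CAPACITY))  # -> 15 (items 0, 2, and 3)
```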
Abstract: We propose and evaluate a system which learns a neural network heuristic function for forward search-based, satisficing classical planning. Our system learns distance-to-goal estimators from scratch, given a single PDDL training instance. Training data is generated by backward regression search or by backward search from given or guessed goal states. In domains such as the 24-puzzle, where all instances share the same search space, such heuristics can also be reused across all instances in the domain. We show that this relatively simple system can perform surprisingly well, sometimes achieving performance competitive with well-known domain-independent heuristics.
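A minimal sketch of the data-generation step: breadth-first search backward from a goal state labels each visited state with its exact distance-to-goal, yielding (state, distance) pairs for supervised training of the estimator. The `predecessors` callback is a hypothetical stand-in for regression over the domain's operators.

```python
from collections import deque

def backward_distance_labels(goal_state, predecessors, max_states=100_000):
    """BFS backward from the goal state; each visited state is labeled
    with its distance to the goal, producing supervised training pairs
    for a distance-to-goal estimator.

    `predecessors(state)` is assumed to return the states from which
    `state` is reachable in one step (i.e., one regression step).
    """
    dist = {goal_state: 0}
    queue = deque([goal_state])
    while queue and len(dist) < max_states:
        s = queue.popleft()
        for p in predecessors(s):
            if p not in dist:        # first visit = shortest distance in BFS
                dist[p] = dist[s] + 1
                queue.append(p)
    return list(dist.items())        # [(state, distance_to_goal), ...]
```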
Abstract: For combinatorial optimization problems, model-based approaches such as mixed-integer programming (MIP) and constraint programming (CP) aim to decouple the modeling and the solving of a problem: the 'holy grail' of declarative problem solving. We propose domain-independent dynamic programming (DIDP), a new model-based paradigm based on dynamic programming (DP). While DP is not new, it has typically been implemented as a problem-specific method. We propose Dynamic Programming Description Language (DyPDL), a formalism to define DP models, and develop Cost-Algebraic A* Solver for DyPDL (CAASDy), a generic solver for DyPDL using state space search. We formalize existing problem-specific DP and state space search methods for combinatorial optimization problems as DP models in DyPDL. Using CAASDy and commercial MIP and CP solvers, we experimentally compare the DP models with existing MIP and CP models, showing that, despite its nascent nature, CAASDy outperforms MIP and CP on a number of common problem classes.
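The sketch below shows A* over a DP state space in the spirit of CAASDy: for a minimization model, a dual bound (a model-supplied lower bound on the cost-to-go) acts as an admissible heuristic, so the first base case popped is optimal. The callbacks are hypothetical stand-ins, not the CAASDy API, and the sketch omits the cost-algebra generalization.

```python
import heapq
import itertools

def astar_over_dp(initial_state, dual_bound, successors, is_base_case):
    """A* for a minimization DP model, using a dual bound as the heuristic.

    Hypothetical callback signatures:
      dual_bound(state) -> float, a lower bound on remaining cost
      successors(state) -> iterable of (cost, next_state)
      is_base_case(state) -> bool
    """
    tie = itertools.count()
    open_list = [(dual_bound(initial_state), next(tie), 0.0, initial_state)]
    best_g = {initial_state: 0.0}
    while open_list:
        f, _, g, state = heapq.heappop(open_list)
        if is_base_case(state):
            return g  # optimal: the heuristic never overestimates
        if g > best_g.get(state, float("inf")):
            continue  # stale entry; a cheaper path to this state was found
        for cost, nxt in successors(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(open_list, (g2 + dual_bound(nxt), next(tie), g2, nxt))
    return None  # no base case reachable
```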