Michał Zawalski

What Matters in Hierarchical Search for Combinatorial Reasoning Problems?

Jun 05, 2024

Michał Zawalski, Gracjan Góral, Michał Tyrolski, Emilia Wiśnios, Franciszek Budrowski, Łukasz Kuciński, Piotr Miłoś

Abstract: Efficiently tackling combinatorial reasoning problems, particularly the notorious NP-hard tasks, remains a significant challenge for AI research. Recent efforts have sought to enhance planning by incorporating hierarchical high-level search strategies, known as subgoal methods. While promising, their performance against traditional low-level planners is inconsistent, raising questions about their application contexts. In this study, we conduct an in-depth exploration of subgoal-planning methods for combinatorial reasoning. We identify the attributes pivotal for leveraging the advantages of high-level search: hard-to-learn value functions, complex action spaces, the presence of dead ends in the environment, or data collected from diverse experts. We propose a consistent evaluation methodology to achieve meaningful comparisons between methods and reevaluate the state-of-the-art algorithms.

* Accepted for Generative Models for Decision Making Workshop at ICLR 2024

Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

Jun 01, 2022

Michał Zawalski, Michał Tyrolski, Konrad Czechowski, Damian Stachura, Piotr Piękos, Tomasz Odrzygóźdź, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

Abstract: Complex reasoning problems contain states that vary in the computational cost required to determine a good action plan. Taking advantage of this property, we propose Adaptive Subgoal Search (AdaSubS), a search method that adaptively adjusts the planning horizon. To this end, AdaSubS generates diverse sets of subgoals at different distances. A verification mechanism is employed to swiftly filter out unreachable subgoals, allowing the search to focus on feasible further subgoals. In this way, AdaSubS benefits from the efficiency of planning with longer subgoals and the fine control of shorter ones. We show that AdaSubS significantly surpasses hierarchical planning algorithms on three complex reasoning tasks: Sokoban, the Rubik's Cube, and the inequality-proving benchmark INT, setting a new state of the art on INT.
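The idea of mixing subgoal distances and filtering unreachable candidates can be illustrated with a minimal toy sketch. It is not the authors' implementation: the number-line environment, the functions `propose`, `verify`, and `adaptive_subgoal_search`, and the priority rule (prefer nodes reached by longer subgoals, then nodes closer to the goal) are all assumptions made for illustration, standing in for learned generators, a learned verifier, and a learned value function.

```python
import heapq

def propose(state, distance):
    """Hypothetical stand-in for a learned generator at a given distance."""
    return (state + distance, state - distance)

def verify(subgoal, low, high):
    """Hypothetical stand-in for the verifier: reject unreachable subgoals."""
    return low <= subgoal <= high

def adaptive_subgoal_search(start, goal, distances=(4, 2, 1), low=0, high=None):
    """Best-first search that prefers long subgoals and falls back to short ones."""
    high = goal if high is None else high
    # Queue entries: (-distance used to reach the node, distance to goal, node).
    frontier = [(0, abs(goal - start), start)]
    parent = {start: None}
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for d in distances:
            for sub in propose(state, d):
                # Skip candidates the verifier rejects or that were already seen.
                if verify(sub, low, high) and sub not in parent:
                    parent[sub] = state
                    heapq.heappush(frontier, (-d, abs(goal - sub), sub))
    return None

print(adaptive_subgoal_search(0, 7))  # → [0, 4, 6, 7]
```

In this toy run the search takes a long step (4) while far from the goal, then shorter steps (2, then 1) once longer candidates would leave the allowed range, which mirrors the trade-off between efficiency and fine control described in the abstract.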

Off-Policy Correction For Multi-Agent Reinforcement Learning

Nov 22, 2021

Michał Zawalski, Błażej Osiński, Henryk Michalewski, Piotr Miłoś

Abstract: Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents. Despite apparent similarity to the single-agent case, multi-agent problems are often harder to train and analyze theoretically. In this work, we propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting. The key advantage of our algorithm is its high scalability in a multi-worker setting. To this end, MA-Trace utilizes importance sampling as an off-policy correction method, which allows distributing the computations with no impact on the quality of training. Furthermore, our algorithm is theoretically grounded: we prove a fixed-point theorem that guarantees convergence. We evaluate the algorithm extensively on the StarCraft Multi-Agent Challenge, a standard benchmark for multi-agent algorithms. MA-Trace achieves high performance on all its tasks and exceeds state-of-the-art results on some of them.
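For context, the importance-sampling correction that MA-Trace extends can be sketched with the single-agent V-Trace value targets. This is not MA-Trace itself: how the importance ratios are formed for multiple agents is not stated in the abstract and is not shown here. The function name `vtrace_targets` and its parameter names are illustrative; `ratios[s]` is assumed to hold the ratio of target-policy to behaviour-policy probability for the action taken at step `s`.

```python
def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-Trace targets v_s for one trajectory.

    Uses the recursion
        v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})),
        delta_s = rho_s * (r_s + gamma * V(x_{s+1}) - V(x_s)),
    with truncated ratios rho_s = min(rho_bar, ratio_s), c_s = min(c_bar, ratio_s).
    """
    values = list(values)
    next_values = values[1:] + [bootstrap_value]
    targets = [0.0] * len(rewards)
    correction = 0.0  # holds v_{s+1} - V(x_{s+1}) while iterating backwards
    for s in reversed(range(len(rewards))):
        rho = min(rho_bar, ratios[s])
        c = min(c_bar, ratios[s])
        delta = rho * (rewards[s] + gamma * next_values[s] - values[s])
        correction = delta + gamma * c * correction
        targets[s] = values[s] + correction
    return targets

# With all ratios equal to 1 (on-policy data) the targets reduce to n-step returns.
print(vtrace_targets([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0,
                     [1.0, 1.0, 1.0], gamma=1.0))  # → [3.0, 2.0, 1.0]
```

Truncating the ratios at `rho_bar` and `c_bar` bounds the variance of the correction, which is what lets data gathered by slightly stale worker policies still be used for training.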

Subgoal Search For Complex Reasoning Tasks

Aug 25, 2021

Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

Abstract: Humans excel in solving complex reasoning tasks through a mental process of moving from one idea to a related one. Inspired by this, we propose the Subgoal Search (kSubS) method. Its key component is a learned subgoal generator that produces a diversity of subgoals that are both achievable and closer to the solution. Using subgoals reduces the search space and induces a high-level search graph suitable for efficient planning. In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework. We show that a simple approach of generating $k$-th step ahead subgoals is surprisingly efficient on three challenging domains: two popular puzzle games, Sokoban and the Rubik's Cube, and an inequality proving benchmark INT. kSubS achieves strong results including state-of-the-art on INT within a modest computational budget.
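A minimal toy sketch of best-first search over generated subgoals may help make the high-level search graph concrete. It is an illustration under stated assumptions, not the paper's implementation: the number-line environment and the functions `generate_subgoals`, `value`, and `subgoal_best_first_search` are hypothetical stand-ins for the transformer-based subgoal generator and a learned value function, and the low-level step of reaching each subgoal is omitted.

```python
import heapq

def generate_subgoals(state, goal, k):
    """Hypothetical stand-in for a learned generator: states about k steps away."""
    candidates = {state + k, state - k}
    if abs(goal - state) <= k:
        candidates.add(goal)  # the goal itself is within k steps
    return candidates

def value(state, goal):
    """Hypothetical stand-in for a learned value function (higher is better)."""
    return -abs(goal - state)

def subgoal_best_first_search(start, goal, k=4, max_expansions=1000):
    """Expand the highest-value node first, moving k steps at a time."""
    frontier = [(-value(start, goal), start)]  # min-heap on negated value
    parent = {start: None}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, state = heapq.heappop(frontier)
        expansions += 1
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for sub in generate_subgoals(state, goal, k):
            if sub not in parent:
                parent[sub] = state
                heapq.heappush(frontier, (-value(sub, goal), sub))
    return None

print(subgoal_best_first_search(0, 10, k=4))  # → [0, 4, 8, 10]
```

The returned path has three high-level edges instead of ten unit moves, which is the sense in which searching over subgoals shrinks the search graph.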
