Title: ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments

Feb 28, 2025

Pedro Gimenes, Zeyu Cao, Jeffrey Wong, Yiren Zhao

Abstract: Recent research has shown that LLM performance on reasoning tasks can be enhanced by scaling test-time compute. One promising approach, particularly with decomposable problems, involves arranging intermediate solutions as a graph on which transformations are performed to explore the solution space. However, prior works rely on pre-determined, task-specific transformation schedules which are subject to a set of searched hyperparameters. In this work, we view thought graph transformations as actions in a Markov decision process, and implement policy agents to drive effective action policies for the underlying reasoning LLM agent. In particular, we investigate the ability of another LLM to act as a policy agent on thought graph environments and introduce ARIES, a multi-agent architecture for reasoning with LLMs. In ARIES, reasoning LLM agents solve decomposed subproblems, while policy LLM agents maintain visibility of the thought graph states and dynamically adapt the problem-solving strategy. Through extensive experiments, we observe that using off-the-shelf LLMs as policy agents with no supervised fine-tuning (SFT) can yield up to $29\%$ higher accuracy on HumanEval relative to static transformation schedules, while reducing inference costs by $35\%$ and avoiding any search requirements. We also conduct a thorough analysis of observed failure modes, highlighting that limitations on LLM sizes and the depth of problem decomposition can be seen as challenges to scaling LLM-guided reasoning.
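The policy/reasoning split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the action names (`decompose`, `solve`, `aggregate`, `stop`), the `ThoughtGraphEnv` class, the leaf-size threshold, and the sorting task are all assumptions chosen for illustration. Both agents are plain Python functions standing in for LLM calls, and the environment tracks only the current frontier of thoughts rather than a full graph with edges.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

MAX_LEAF = 4  # hypothetical threshold: larger subproblems get decomposed


@dataclass
class Thought:
    problem: List[int]                    # a (sub)list to be sorted
    solution: Optional[List[int]] = None  # filled in by the reasoning agent


@dataclass
class ThoughtGraphEnv:
    """Toy environment: the state is the current frontier of thoughts."""
    thoughts: List[Thought]

    def step(self, action: str, solver: Callable[[List[int]], List[int]]) -> None:
        if action == "decompose":
            # Split every unsolved thought that is still too large in half.
            new: List[Thought] = []
            for t in self.thoughts:
                if t.solution is None and len(t.problem) > MAX_LEAF:
                    mid = len(t.problem) // 2
                    new += [Thought(t.problem[:mid]), Thought(t.problem[mid:])]
                else:
                    new.append(t)
            self.thoughts = new
        elif action == "solve":
            # The reasoning agent solves each open subproblem.
            for t in self.thoughts:
                if t.solution is None:
                    t.solution = solver(t.problem)
        elif action == "aggregate":
            # Merge all (already sorted) partial solutions into one thought.
            merged = list(heapq.merge(*(t.solution for t in self.thoughts)))
            self.thoughts = [Thought(problem=merged, solution=merged)]
        else:
            raise ValueError(f"unknown action: {action}")


def rule_policy(env: ThoughtGraphEnv) -> str:
    """Stand-in for the policy LLM: inspects the state and picks an action."""
    if any(t.solution is None and len(t.problem) > MAX_LEAF for t in env.thoughts):
        return "decompose"
    if any(t.solution is None for t in env.thoughts):
        return "solve"
    if len(env.thoughts) > 1:
        return "aggregate"
    return "stop"


def sorted_solver(xs: List[int]) -> List[int]:
    """Stand-in for the reasoning LLM: solves one small subproblem."""
    return sorted(xs)


def run(problem: List[int], policy, solver, max_steps: int = 20) -> Tuple[List[int], List[str]]:
    env = ThoughtGraphEnv([Thought(problem)])
    trace: List[str] = []
    for _ in range(max_steps):
        action = policy(env)  # the policy agent observes the state each step
        if action == "stop":
            break
        env.step(action, solver)
        trace.append(action)
    return env.thoughts[0].solution, trace


result, trace = run([9, 3, 7, 1, 8, 2, 6, 5, 4, 0], rule_policy, sorted_solver)
print(result, trace)
```

The point of the sketch is that the action sequence is chosen step by step from the observed state rather than fixed in advance. A static schedule would correspond to replacing `rule_policy` with a hard-coded list of actions.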
