Title: Discriminator-Guided Multi-step Reasoning with Language Models

May 24, 2023

Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang

Abstract: In the context of multi-step reasoning, language model (LM) probabilities are often miscalibrated: solutions with high probabilities are not always correct. Therefore, greedy decoding, which is the standard decoding method for reasoning tasks, often yields incorrect solutions. In addition, methods such as self-consistency and verifiers rely on sampling from the LM distribution and do not tackle the underlying issue. To address this, we introduce Guiding Multi-step ReAsoning with a CorrectnEss Discriminator (GRACE), a stepwise decoding approach that nudges the model towards producing correct reasoning steps. GRACE employs a discriminator model, which is trained to differentiate correct steps from invalid ones, to adjust decoding preferences based on the correctness of each reasoning step. Importantly, GRACE does not require fine-tuning or re-training the LMs. When compared with conventional decoding strategies over four popular math reasoning benchmarks, GRACE exhibits significant improvements in both final answer accuracy and step correctness, outperforming both greedy decoding and self-consistency. Our code can be found at https://github.com/mukhal/grace.
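The stepwise decoding idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation (see the linked repository for that). The function names, the `beta` weight, the linear mix of LM log-probability and discriminator score, and the hard-coded candidate table are all assumptions made for illustration; a real system would sample candidate steps from an LM and score them with a trained discriminator.

```python
import math
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, float]  # (step text, LM log-probability)


def guided_decode(
    propose: Callable[[List[str]], List[Candidate]],
    discriminator: Callable[[List[str], str], float],
    beta: float = 0.5,
    max_steps: int = 8,
    stop_prefix: str = "Answer:",
) -> List[str]:
    """Pick one reasoning step at a time using a mixed score.

    Assumed scoring rule (hypothetical, for illustration only):
        (1 - beta) * lm_logprob + beta * discriminator_score
    With beta = 0 this reduces to choosing the most probable step.
    """
    steps: List[str] = []
    for _ in range(max_steps):
        candidates = propose(steps)
        if not candidates:
            break
        best_step, _ = max(
            candidates,
            key=lambda c: (1.0 - beta) * c[1] + beta * discriminator(steps, c[0]),
        )
        steps.append(best_step)
        if best_step.startswith(stop_prefix):
            break
    return steps


# Toy stand-in for an LM on the problem "2 + 3 * 4": candidate next steps
# keyed by the previous step. The probabilities are made up.
TOY_TABLE: Dict[str, List[Candidate]] = {
    "": [("2 + 3 = 5", math.log(0.6)), ("3 * 4 = 12", math.log(0.4))],
    "2 + 3 = 5": [("5 * 4 = 20", math.log(0.9)), ("5 + 4 = 9", math.log(0.1))],
    "3 * 4 = 12": [("2 + 12 = 14", math.log(0.7)), ("12 - 2 = 10", math.log(0.3))],
    "5 * 4 = 20": [("Answer: 20", 0.0)],
    "5 + 4 = 9": [("Answer: 9", 0.0)],
    "2 + 12 = 14": [("Answer: 14", 0.0)],
    "12 - 2 = 10": [("Answer: 10", 0.0)],
}

CORRECT_STEPS = {"3 * 4 = 12", "2 + 12 = 14", "Answer: 14"}


def toy_propose(prefix: List[str]) -> List[Candidate]:
    """Return the toy candidates that follow the last chosen step."""
    return TOY_TABLE.get(prefix[-1] if prefix else "", [])


def toy_discriminator(prefix: List[str], step: str) -> float:
    """Toy correctness score: 0.0 for a correct step, -5.0 otherwise."""
    return 0.0 if step in CORRECT_STEPS else -5.0


lm_only = guided_decode(toy_propose, toy_discriminator, beta=0.0)
guided = guided_decode(toy_propose, toy_discriminator, beta=0.5)
```

In this toy setup, `lm_only` follows the higher-probability but incorrect first step and ends at `Answer: 20`, while `guided` is steered to `Answer: 14`. The LM itself is never modified, which mirrors the abstract's point that no fine-tuning or re-training of the LM is required.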

* 19 pages, 7 figures, and 8 tables
