Title: Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Aug 29, 2024

Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun

Abstract: Self-critic has become an important mechanism for enhancing the reasoning performance of LLMs. However, current approaches mainly involve basic prompts without further training, which tend to be over-simplified, leading to limited accuracy. Moreover, there is a lack of in-depth investigation of the relationship between an LLM's ability to critique and its task-solving performance. To address these issues, we propose Critic-CoT, a novel framework that pushes LLMs toward System-2-like critic capability via a step-wise CoT reasoning format and distant-supervision data construction, without the need for human annotation. Experiments on GSM8K and MATH show that, by filtering out invalid solutions or by iterative refinement, our enhanced model boosts task-solving performance, which demonstrates the effectiveness of our method. Further, we find that training on critique and refinement alone improves generation. We hope our work can shed light on future research on improving the reasoning and critic ability of LLMs.
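The two inference-time uses mentioned in the abstract, filtering out invalid solutions and iterative refinement, could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `critic_filtered_vote`, `iterative_refine`, `toy_critic`, and `toy_refiner` are hypothetical, the fallback to unfiltered voting and the `max_rounds` limit are assumptions, and the toy critic merely checks `a + b = c` arithmetic steps, where the paper uses a trained LLM critic.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical types: a solution is (reasoning steps, final answer).
Solution = Tuple[List[str], str]
# A step-wise critic returns the index of the first flawed step, or None.
Critic = Callable[[Solution], Optional[int]]
Refiner = Callable[[Solution, int], Solution]


def critic_filtered_vote(candidates: Sequence[Solution], critic: Critic) -> str:
    """Majority vote over the candidates the critic accepts.

    Assumption: if the critic rejects everything, vote over all candidates.
    """
    kept = [c for c in candidates if critic(c) is None]
    pool = kept if kept else list(candidates)
    return Counter(answer for _, answer in pool).most_common(1)[0][0]


def iterative_refine(solution: Solution, critic: Critic, refiner: Refiner,
                     max_rounds: int = 3) -> Solution:
    """Alternate critique and refinement until the critic finds no flaw."""
    for _ in range(max_rounds):
        bad_step = critic(solution)
        if bad_step is None:
            break
        solution = refiner(solution, bad_step)
    return solution


def toy_critic(solution: Solution) -> Optional[int]:
    """Stand-in critic: checks steps written as 'a + b = c'."""
    steps, _ = solution
    for i, step in enumerate(steps):
        lhs, rhs = step.split("=")
        a, b = lhs.split("+")
        if int(a) + int(b) != int(rhs):
            return i
    return None


def toy_refiner(solution: Solution, bad_step: int) -> Solution:
    """Stand-in refiner: recomputes the flawed step, then re-reads the answer."""
    steps, _ = solution
    steps = list(steps)
    lhs = steps[bad_step].split("=")[0].strip()
    a, b = lhs.split("+")
    steps[bad_step] = f"{lhs} = {int(a) + int(b)}"
    return steps, steps[-1].split("=")[1].strip()


candidates = [(["2 + 3 = 5"], "5"), (["2 + 3 = 6"], "6"), (["2 + 3 = 6"], "6")]
voted = critic_filtered_vote(candidates, toy_critic)
refined = iterative_refine((["2 + 3 = 6"], "6"), toy_critic, toy_refiner)
```

In this toy run, plain majority voting would pick the flawed answer "6", while the critic-filtered vote keeps only the verified candidate. The refinement loop instead repairs a single flawed solution in place of discarding it.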
