Jierui Li

CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models

Nov 07, 2024

Jierui Li, Hung Le, Yinbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo

Abstract: Pre-trained on massive amounts of code and text data, large language models (LLMs) have demonstrated remarkable achievements in performing code generation tasks. With additional execution-based feedback, these models can act as agents with capabilities to self-refine and improve generated code autonomously. However, on challenging coding tasks with extremely large search spaces, current agentic approaches still struggle with multi-stage planning, generating, and debugging. To address this problem, we propose CodeTree, a framework for LLM agents to efficiently explore the search space in different stages of the code generation process. Specifically, we adopted a unified tree structure to explicitly explore different coding strategies, generate corresponding coding solutions, and subsequently refine the solutions. In each stage, critical decision-making (ranking, termination, expanding) of the exploration process is guided by both the environmental execution-based feedback and LLM-agent-generated feedback. We comprehensively evaluated CodeTree on 7 code generation benchmarks and demonstrated the significant performance gains of CodeTree against strong baselines. Using GPT-4o as the base model, we consistently achieved top results of 95.1 on HumanEval, 98.7 on MBPP, and 43.0 on CodeContests. On the challenging SWEBench benchmark, our approach led to significant performance gains.
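
The ranking, termination, and expanding decisions described in the abstract can be illustrated with a toy best-first search over candidate programs. This is only a sketch under stated assumptions, not the CodeTree implementation: `pass_rate`, `tree_search`, and the caller-supplied `expand` function are hypothetical names, candidates are plain Python callables rather than LLM-generated code, and only execution-based feedback (test pass rate) is modeled, with no LLM-agent feedback.

```python
import heapq
from typing import Callable, List, Tuple

Candidate = Callable[[int], int]   # hypothetical: a candidate solution is a callable
TestCase = Tuple[int, int]         # (input, expected output)


def pass_rate(candidate: Candidate, tests: List[TestCase]) -> float:
    """Execution-based feedback: fraction of test cases the candidate passes."""
    passed = 0
    for arg, expected in tests:
        try:
            if candidate(arg) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails that test
    return passed / len(tests)


def tree_search(
    roots: List[Candidate],
    expand: Callable[[Candidate], List[Candidate]],  # stand-in for an LLM refinement step
    tests: List[TestCase],
    max_nodes: int = 20,
) -> Tuple[Candidate, float]:
    """Best-first search: rank by pass rate, stop on a full pass, else expand."""
    frontier = []  # min-heap of (-score, tie_breaker, candidate)
    counter = 0
    for cand in roots:
        heapq.heappush(frontier, (-pass_rate(cand, tests), counter, cand))
        counter += 1

    best, best_score = roots[0], -1.0
    explored = 0
    while frontier and explored < max_nodes:
        neg_score, _, cand = heapq.heappop(frontier)  # ranking: best score first
        score = -neg_score
        explored += 1
        if score > best_score:
            best, best_score = cand, score
        if score == 1.0:                              # termination: all tests pass
            break
        for child in expand(cand):                    # expanding: refine this node
            heapq.heappush(frontier, (-pass_rate(child, tests), counter, child))
            counter += 1
    return best, best_score
```

In a real agentic setting, `expand` would presumably call a model with the failing test output, and the node budget `max_nodes` would bound the number of model calls.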

Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs

Apr 11, 2024

Jierui Li, Raymond Mooney

Abstract: Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models (LLMs) across various tasks. However, when tackling complex tasks that pose significant challenges for state-of-the-art models, this technique often struggles to produce effective chains of thought that lead to correct answers. In this work, we propose a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions. We apply our method to solving competitive-level programming challenges. More specifically, we employ an LLM to generate explanations for a set of <problem, solution-program> pairs, then use <problem, explanation> pairs to fine-tune a smaller language model, which we refer to as the Reasoner, to learn algorithmic reasoning that can generate "how-to-solve" hints for unseen problems. Our experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder, resulting in higher solve rates than strong chain-of-thought baselines on competitive-level programming problems. It also outperforms models that learn directly from <problem, solution-program> pairs. We curated an additional test set in the CodeContests format, which includes 246 more recent problems posted after the models' knowledge cutoff.

* pre-print

ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models

Nov 15, 2023

Jierui Li, Vipul Raheja, Dhruv Kumar

Abstract: In recent times, large language models (LLMs) have shown impressive performance on various document-level tasks such as document classification, summarization, and question-answering. However, research on understanding their capabilities on the task of detecting self-contradictions in long documents has been very limited. In this work, we introduce ContraDoc, the first human-annotated dataset to study self-contradictions in long documents across multiple domains, varying document lengths, self-contradiction types, and scope. We then analyze the current capabilities of four state-of-the-art open-source and commercially available LLMs: GPT3.5, GPT4, PaLM2, and LLaMAv2 on this dataset. While GPT4 performs the best and can outperform humans on this task, we find that it is still unreliable and struggles with self-contradictions that require more nuance and context. We release the dataset and all the code associated with the experiments.

Explaining Competitive-Level Programming Solutions using LLMs

Jul 11, 2023

Jierui Li, Szymon Tworkowski, Yingying Wu, Raymond Mooney

Abstract: In this paper, we approach competitive-level programming problem-solving as a composite task of reasoning and code generation. We propose a novel method to automatically annotate natural language explanations to <problem, solution> pairs. We show that despite poor performance in solving competitive-level programming problems, state-of-the-art LLMs exhibit a strong capacity in describing and explaining solutions. Our explanation generation methodology can generate a structured solution explanation for the problem containing descriptions and analysis. To evaluate the quality of the annotated explanations, we examine their effectiveness in two aspects: 1) satisfying the human programming expert who authored the oracle solution, and 2) aiding LLMs in solving problems more effectively. The experimental results on the CodeContests dataset demonstrate that while GPT-3.5 and GPT-4 are comparable in their ability to describe the solution, GPT-4 shows a better understanding of the key idea behind the solution.

* 14 pages, presented at the 1st NLRSE workshop

Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction

Mar 19, 2022

Zhanming Jie, Jierui Li, Wei Lu

Abstract: Solving math word problems requires deductive reasoning over the quantities in the text. Various recent research efforts mostly relied on sequence-to-sequence or sequence-to-tree models to generate mathematical expressions without explicitly performing relational reasoning between quantities in the given context. While empirically effective, such approaches typically do not provide explanations for the generated expressions. In this work, we view the task as a complex relation extraction problem, proposing a novel approach that presents explainable deductive reasoning steps to iteratively construct target expressions, where each step involves a primitive operation over two quantities defining their relation. Through extensive experiments on four benchmark datasets, we show that the proposed model significantly outperforms existing strong baselines. We further demonstrate that the deductive procedure not only presents more explainable steps but also enables us to make more accurate predictions on questions that require more complex reasoning.

* 12 pages, 7 figures, Accepted by ACL-2022 main conference as a long paper
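
The idea of building a target expression through steps that each apply a primitive operation to two quantities can be made concrete with a small replay routine. This is a minimal sketch under assumptions, not the paper's model: the `Step` encoding and the `run_deduction` function are hypothetical, and the steps are supplied by hand here, whereas the proposed approach learns which pair of quantities and which operation to choose.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Primitive operations over two quantities (an assumed, illustrative set).
OPS: Dict[str, Callable[[float, float], float]] = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

# Hypothetical step encoding: (index of left quantity, operator, index of right quantity).
Step = Tuple[int, str, int]


def run_deduction(quantities: List[float], steps: List[Step]) -> Tuple[float, str]:
    """Replay deductive steps; each result joins the pool as a new quantity."""
    pool = list(quantities)                 # values available so far
    exprs = [str(q) for q in quantities]    # readable expression for each value
    for left, op, right in steps:
        pool.append(OPS[op](pool[left], pool[right]))
        exprs.append(f"({exprs[left]} {op} {exprs[right]})")
    return pool[-1], exprs[-1]


# Quantities 3, 4, 2 from a hypothetical problem; first add 3 and 4,
# then multiply the intermediate result (index 3) by 2.
value, expression = run_deduction([3, 4, 2], [(0, "+", 1), (3, "*", 2)])
print(expression, "=", value)  # ((3 + 4) * 2) = 14
```

Because every intermediate result stays in the pool with its own expression string, each step can be read as a stated relation between two quantities, which is the kind of explainability the abstract refers to.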

Evaluating Explanation Methods for Neural Machine Translation

May 04, 2020

Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi

Abstract: Recently, many efforts have been devoted to interpreting black-box NMT models, but little progress has been made on metrics to evaluate explanation methods. Word Alignment Error Rate can be used as such a metric that matches human understanding; however, it cannot measure explanation methods on those target words that are not aligned to any source word. This paper thereby makes an initial attempt to evaluate explanation methods from an alternative viewpoint. To this end, it proposes a principled metric based on fidelity in regard to the predictive behavior of the NMT model. As the exact computation for this metric is intractable, we employ an efficient approach as its approximation. On six standard translation tasks, we quantitatively evaluate several explanation methods in terms of the proposed metric and we reveal some valuable findings for these explanation methods in our experiments.

* Accepted to ACL 2020, 9 pages
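
For readers unfamiliar with the Word Alignment Error Rate mentioned in the abstract, the standard definition over predicted links A, sure reference links S, and possible reference links P (with S a subset of P) is AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The sketch below computes it; the function name and the link encoding as (source index, target index) pairs are assumptions made for illustration, and this is not the fidelity-based metric the paper proposes.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # assumed encoding: (source word index, target word index)


def alignment_error_rate(
    predicted: Set[Link], sure: Set[Link], possible: Set[Link]
) -> float:
    """Standard AER: 1 - (|A & S| + |A & P|) / (|A| + |S|), with S included in P."""
    possible = possible | sure          # sure links always count as possible
    denominator = len(predicted) + len(sure)
    if denominator == 0:
        return 0.0                      # nothing predicted, nothing required
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / denominator


# Two sure links are recovered; the third predicted link matches no reference link.
predicted = {(0, 0), (1, 1), (2, 3)}
sure = {(0, 0), (1, 1)}
possible = {(2, 2)}
print(round(alignment_error_rate(predicted, sure, possible), 3))  # 0.2
```

The score is computed only from links, so a target word with no reference link to any source word contributes nothing that could reward a good explanation of it, which is the limitation the abstract points out.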
