Jaejin Kim

PERC: Plan-As-Query Example Retrieval for Underrepresented Code Generation

Dec 17, 2024

Jaeseok Yoo, Hojae Han, Youngwon Lee, Jaejin Kim, Seung-won Hwang

Abstract: Code generation with large language models has shown significant promise, especially when employing retrieval-augmented generation (RAG) with few-shot examples. However, selecting effective examples that enhance generation quality remains a challenging task, particularly when the target programming language (PL) is underrepresented. In this study, we present two key findings: (1) retrieving examples whose algorithmic plans can be referenced for generating the desired behavior significantly improves generation accuracy, and (2) converting code into pseudocode effectively captures such algorithmic plans, enhancing retrieval quality even when the source and target PLs are different. Based on these findings, we propose Plan-as-query Example Retrieval for few-shot prompting in Code generation (PERC), a novel framework that utilizes algorithmic plans to identify and retrieve effective examples. We validate the effectiveness of PERC through extensive experiments on the CodeContests, HumanEval, and MultiPL-E benchmarks: PERC consistently outperforms state-of-the-art RAG methods in code generation, whether the source and target programming languages match or differ, highlighting its adaptability and robustness in diverse coding environments.

* Accepted by COLING 2025 main conference
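
The plan-as-query idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the pseudocode plans are hand-written here (the abstract describes deriving them from code), a bag-of-words cosine similarity stands in for whatever retriever PERC actually uses, and the names `retrieve_by_plan`, `pool`, and `query_plan` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a deliberately simple stand-in for a real encoder.
    return re.findall(r"[a-z_]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_by_plan(query_plan, pool, k=1):
    # Rank stored examples by similarity between their plan and the query plan,
    # ignoring the surface syntax of the example's source language.
    query_vec = Counter(tokenize(query_plan))
    ranked = sorted(
        pool,
        key=lambda ex: cosine(query_vec, Counter(tokenize(ex["plan"]))),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical example pool: code in one language, plans in pseudocode.
pool = [
    {
        "name": "binary_search",
        "source_lang": "cpp",
        "plan": "set low and high; while low <= high: compute mid; "
                "compare target with middle element; narrow the range",
    },
    {
        "name": "prefix_sum",
        "source_lang": "cpp",
        "plan": "initialize running total; for each element: add element "
                "to running total; append running total to result",
    },
]

# Plan for the target problem, assumed to have been drafted beforehand.
query_plan = ("for each element in list: add element to running total; "
              "store running total in result")

top = retrieve_by_plan(query_plan, pool, k=1)
print(top[0]["name"])
```

Because matching happens on the plans rather than on the code, the retrieved example can be written in a different language from the generation target, which is the cross-language setting the abstract emphasizes.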

ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models

Aug 02, 2024

Hojae Han, Jaejin Kim, Jaeseok Yoo, Youngwon Lee, Seung-won Hwang

Abstract: This paper aims to extend the code generation capability of large language models (LLMs) to automatically manage comprehensive software requirements from given textual descriptions. Such requirements include both functional (i.e., achieving expected behavior for inputs) and non-functional (e.g., time/space performance, robustness, maintainability) requirements. However, textual descriptions can either express requirements verbosely or may even omit some of them. We introduce ARCHCODE, a novel framework that leverages in-context learning to organize requirements observed in descriptions and to extrapolate unexpressed requirements from them. ARCHCODE generates requirements from given descriptions, then conditions on them to produce code snippets and test cases. Each test case is tailored to one of the requirements, allowing code snippets to be ranked by how well their execution results comply with the requirements. Public benchmarks show that ARCHCODE better satisfies functional requirements, significantly improving Pass@k scores. Furthermore, we introduce HumanEval-NFR, the first evaluation of LLMs' handling of non-functional requirements in code generation, demonstrating ARCHCODE's superiority over baseline methods. The implementation of ARCHCODE and the HumanEval-NFR benchmark are both publicly accessible.

* Accepted by ACL 2024 main conference
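
The ranking step described in the abstract above (one test case per requirement, candidates ordered by how many requirements their execution satisfies) might look roughly like the sketch below. It is an illustration under stated assumptions, not the ARCHCODE implementation: the requirements, test cases, and candidate functions are hard-coded here, whereas the abstract describes an LLM generating them, and the names `rank_candidates`, `passes`, and `tests` are hypothetical.

```python
def candidate_a(xs):
    # Hypothetical candidate: correct on normal input, crashes on an empty list.
    return sum(xs) / len(xs)


def candidate_b(xs):
    # Hypothetical candidate: also handles the empty-list case.
    return sum(xs) / len(xs) if xs else 0.0


# Each test case is tied to exactly one requirement (assumed for illustration).
tests = [
    ("functional: returns the average of the inputs",
     lambda f: f([1, 2, 3]) == 2.0),
    ("non-functional (robustness): empty input returns 0.0",
     lambda f: f([]) == 0.0),
]


def passes(check, func):
    # A test that raises counts as a failed requirement rather than aborting.
    try:
        return bool(check(func))
    except Exception:
        return False


def rank_candidates(candidates, tests):
    # Score each candidate by the number of requirement-specific tests it passes,
    # then sort from most to least compliant.
    scored = [
        (name, sum(passes(check, func) for _, check in tests))
        for name, func in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_candidates(
    {"candidate_a": candidate_a, "candidate_b": candidate_b}, tests
)
print(ranking)
```

Tying each test to a single requirement keeps the score interpretable: a low-ranked candidate can be traced back to the specific functional or non-functional requirement it violated.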
