Title: From Token to Line: Enhancing Code Generation with a Long-Term Perspective

Apr 10, 2025

Tingwei Lu, Yangning Li, Liyuan Wang, Binghuai Lin, Jiwei Tang, Wanshi Xu, Hai-Tao Zheng, Yinghui Li, Bingxu An, Zhao Wei (+1 more)

Abstract: The emergence of large language models (LLMs) has significantly advanced the code generation task, sparking a surge in related literature. Current approaches are hindered by redundant generation results and a tendency to overfit local patterns in the short term. Although existing studies attempt to alleviate this issue by adopting a multi-token prediction strategy, little attention has been paid to choosing the appropriate processing length for generation. Analyzing the attention between tokens during LLM generation shows that the high spikes in attention scores typically appear at the ends of lines. This insight suggests that it is reasonable to treat each line of code as a fundamental processing unit and to generate lines sequentially. Inspired by this, we propose the LSR-MCTS algorithm, which leverages Monte Carlo Tree Search (MCTS) to determine the code line by line and select the optimal path. Further, we integrate a self-refine mechanism at each node to enhance diversity and generate higher-quality programs through error correction. Extensive experiments and comprehensive analyses on three public coding benchmarks demonstrate that our method outperforms state-of-the-art approaches.
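The line-by-line search described in the abstract can be illustrated with a minimal sketch. This is not the authors' LSR-MCTS implementation: it is a generic MCTS in which each tree node holds a partial program and each edge appends one line. The names `CANDIDATES`, `propose_lines`, `reward`, `Node`, `ucb`, and `mcts` are hypothetical, the candidate lines come from a hard-coded lookup table standing in for an LLM, the reward is the fraction of three toy test cases passed, and the self-refine step is omitted.

```python
import math
import random

# Hypothetical stand-in for an LLM: candidate next lines, keyed by how many
# lines have been written so far. A real system would sample these from a model.
CANDIDATES = {
    0: ["def add(a, b):"],
    1: ["    return a - b", "    return a + b", "    return a * b"],
}


def propose_lines(lines):
    """Return candidate next lines for a partial program (toy lookup)."""
    return CANDIDATES.get(len(lines), [])


def reward(lines):
    """Fraction of toy test cases passed by the finished program."""
    scope = {}
    try:
        exec("\n".join(lines), scope)
        tests = [((1, 2), 3), ((2, 2), 4), ((0, 5), 5)]
        return sum(scope["add"](*args) == want for args, want in tests) / len(tests)
    except Exception:
        return 0.0


class Node:
    """One tree node: a partial program stored as a list of lines."""

    def __init__(self, lines, parent=None):
        self.lines = lines
        self.parent = parent
        self.children = []
        self.untried = list(propose_lines(lines))
        self.visits = 0
        self.value = 0.0


def ucb(child, parent_visits, c=1.4):
    """Upper confidence bound used to pick which child to descend into."""
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def mcts(iterations=50, seed=0):
    rng = random.Random(seed)
    root = Node([])
    best_lines, best_reward = None, -1.0
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not a leaf.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits))
        # Expansion: append one untried candidate line as a new child.
        if node.untried:
            line = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.lines + [line], parent=node)
            node.children.append(child)
            node = child
        # Simulation: finish the program with randomly chosen candidate lines.
        lines = list(node.lines)
        while propose_lines(lines):
            lines.append(rng.choice(propose_lines(lines)))
        r = reward(lines)
        if r > best_reward:
            best_lines, best_reward = lines, r
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    return best_lines, best_reward


if __name__ == "__main__":
    program, score = mcts()
    print("\n".join(program))
    print("reward:", score)
```

Because every edge adds a whole line rather than a single token, the tree stays shallow and each rollout yields a complete program that can be scored by executing tests. The abstract's self-refine mechanism would presumably hook in at the node level, for example by rewriting a low-reward program before its score is propagated, but that placement is an assumption and is not shown here.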
