Title: CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Aug 07, 2024

Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Wenmeng Zhou, Fei Wang, Michael Shieh

Abstract: Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale. Current solutions rely on similarity-based retrieval or manual tools and APIs, each with notable drawbacks. Similarity-based retrieval often has low recall in complex tasks, while manual tools and APIs are typically task-specific and require expert knowledge, reducing their generalizability across diverse code tasks and real-world applications. To mitigate these limitations, we introduce CodexGraph, a system that integrates LLM agents with graph database interfaces extracted from code repositories. By leveraging the structural properties of graph databases and the flexibility of the graph query language, CodexGraph enables the LLM agent to construct and execute queries, allowing for precise, code structure-aware context retrieval and code navigation. We assess CodexGraph using three benchmarks: CrossCodeEval, SWE-bench, and EvoCodeBench. Additionally, we develop five real-world coding applications. With a unified graph database schema, CodexGraph demonstrates competitive performance and potential in both academic and real-world environments, showcasing its versatility and efficacy in software engineering. Our application demo: https://github.com/modelscope/modelscope-agent/tree/master/apps/codexgraph_agent.

* work in progress
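
The abstract's core idea, extracting a graph from a repository and answering structural questions by querying it, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the schema or the query language, so the node labels (MODULE, CLASS, METHOD, FUNCTION), the edge types (CONTAINS, INHERITS, HAS_METHOD), and the helper names `build_code_graph` and `match` are all hypothetical. An in-memory edge set stands in for a real graph database, and a simple pattern matcher stands in for a graph query language.

```python
import ast

# Toy "repository": a single Python source string (illustrative only).
SOURCE = '''
class Base:
    def run(self):
        return helper()

class Child(Base):
    def run(self):
        return 1

def helper():
    return 0
'''


def build_code_graph(source, module_name="demo"):
    """Extract a small code graph from Python source.

    Returns (nodes, edges) where nodes maps a name to a label and edges is
    a set of (src, relation, dst) triples. The labels and relation names are
    assumptions made for this sketch, not a schema taken from the paper.
    """
    tree = ast.parse(source)
    nodes = {module_name: "MODULE"}
    edges = set()
    for item in tree.body:
        if isinstance(item, ast.ClassDef):
            nodes[item.name] = "CLASS"
            edges.add((module_name, "CONTAINS", item.name))
            # Record direct base classes that are written as plain names.
            for base in item.bases:
                if isinstance(base, ast.Name):
                    edges.add((item.name, "INHERITS", base.id))
            # Record methods defined directly in the class body.
            for member in item.body:
                if isinstance(member, ast.FunctionDef):
                    qualified = f"{item.name}.{member.name}"
                    nodes[qualified] = "METHOD"
                    edges.add((item.name, "HAS_METHOD", qualified))
        elif isinstance(item, ast.FunctionDef):
            nodes[item.name] = "FUNCTION"
            edges.add((module_name, "CONTAINS", item.name))
    return nodes, edges


def match(edges, src=None, relation=None, dst=None):
    """Return sorted edges matching a pattern; None acts as a wildcard."""
    return sorted(
        e for e in edges
        if (src is None or e[0] == src)
        and (relation is None or e[1] == relation)
        and (dst is None or e[2] == dst)
    )


nodes, edges = build_code_graph(SOURCE)

# Structural question 1: which classes inherit from Base?
subclasses = [s for s, _, _ in match(edges, relation="INHERITS", dst="Base")]

# Structural question 2: which methods does Child define?
methods = [d for _, _, d in match(edges, src="Child", relation="HAS_METHOD")]

print(subclasses, methods)
```

Unlike similarity-based retrieval, a structural query of this kind returns exactly the entities that satisfy the stated relationship, which is the property the abstract refers to as code structure-aware retrieval. In the system described, the LLM agent would write such queries itself rather than call fixed helper functions.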
