Title: GraphArena: Benchmarking Large Language Models on Graph Computational Problems

Jun 29, 2024

Jianheng Tang, Qifan Zhang, Yuhan Li, Jia Li

Abstract: The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully measure their progress. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of 10 computational tasks, comprising four polynomial-time tasks (e.g., Shortest Distance) and six NP-complete challenges (e.g., Travelling Salesman Problem). It features a rigorous evaluation framework that classifies each LLM output as correct, suboptimal (feasible but not optimal), or hallucinatory (properly formatted but infeasible). Evaluation of 10 leading LLMs, including GPT-4o and LLaMA3-70B-Instruct, reveals that even top-performing models struggle with larger, more complex graph problems and exhibit hallucination issues. Strategies such as chain-of-thought prompting do not resolve these issues. GraphArena supplements existing LLM benchmarks and is open-sourced at https://github.com/squareRoot3/GraphArena.
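
To make the three-way classification concrete, here is a minimal sketch of how a proposed answer to a shortest-path style task could be labelled. It is an illustration only and is not taken from the GraphArena repository: the function names `bfs_distance` and `classify_path`, the adjacency-dict graph representation, and the toy graph are all assumptions, and the graph is treated as unweighted and undirected.

```python
from collections import deque

def bfs_distance(adj, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def classify_path(adj, source, target, path):
    """Label a proposed path 'correct', 'suboptimal', or 'hallucinatory'.

    Hypothetical helper: assumes the answer has already been parsed
    into a list of node names (i.e. it is properly formatted).
    """
    # Feasible means: right endpoints, and every consecutive pair is a real edge.
    feasible = (
        len(path) >= 1
        and path[0] == source
        and path[-1] == target
        and all(v in adj.get(u, ()) for u, v in zip(path, path[1:]))
    )
    if not feasible:
        return "hallucinatory"
    optimal = bfs_distance(adj, source, target)
    return "correct" if len(path) - 1 == optimal else "suboptimal"

# Toy undirected graph (made up for this example).
toy_graph = {
    "A": {"B", "D"},
    "B": {"A", "D"},
    "D": {"A", "B", "E"},
    "E": {"D"},
}

print(classify_path(toy_graph, "A", "E", ["A", "D", "E"]))       # optimal route
print(classify_path(toy_graph, "A", "E", ["A", "B", "D", "E"]))  # valid but longer
print(classify_path(toy_graph, "A", "E", ["A", "E"]))            # uses a missing edge
```

The feasibility check runs before the optimality check, so a path that uses a non-existent edge is labelled hallucinatory even if its length happens to match the optimum. For the NP-complete tasks the same pattern would presumably apply, with the reference optimum coming from an exact solver rather than breadth-first search.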
