Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Marco Vieira

VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching

Sep 16, 2024

Arastoo Zibaeirad, Marco Vieira

Abstract:Large Language Models (LLMs) have shown promise in tasks like code translation, prompting interest in their potential for automating software vulnerability detection (SVD) and patching (SVP). To further research in this area, establishing a benchmark is essential for evaluating the strengths and limitations of LLMs in these tasks. Despite their capabilities, questions remain regarding whether LLMs can accurately analyze complex vulnerabilities and generate appropriate patches. This paper introduces VulnLLMEval, a framework designed to assess the performance of LLMs in identifying and patching vulnerabilities in C code. Our study includes 307 real-world vulnerabilities extracted from the Linux kernel, creating a well-curated dataset that includes both vulnerable and patched code. This dataset, based on real-world code, provides a diverse and representative testbed for evaluating LLM performance in SVD and SVP tasks, offering a robust foundation for rigorous assessment. Our results reveal that LLMs often struggle with distinguishing between vulnerable and patched code. Furthermore, in SVP tasks, these models tend to oversimplify the code, producing solutions that may not be directly usable without further refinement.

Via

Access Paper or Ask Questions

Propheticus: Generalizable Machine Learning Framework

Sep 06, 2018

João R. Campos, Marco Vieira, Ernesto Costa

Abstract:Due to recent technological developments, Machine Learning (ML), a subfield of Artificial Intelligence (AI), has been successfully used to process and extract knowledge from a variety of complex problems. However, a thorough ML approach is complex and highly dependent on the problem at hand. Additionally, implementing the logic required to execute the experiments is no small nor trivial deed, consequentially increasing the probability of faulty code which can compromise the results. Propheticus is a data-driven framework which results of the need for a tool that abstracts some of the inherent complexity of ML, whilst being easy to understand and use, as well as to adapt and expand to assist the user's specific needs. Propheticus systematizes and enforces various complex concepts of an ML experiment workflow, taking into account the nature of both the problem and the data. It contains functionalities to execute all the different tasks, from data preprocessing, to results analysis and comparison. Notwithstanding, it can be fairly easily adapted to different problems due to its flexible architecture, and customized as needed to address the user's needs.

Via

Access Paper or Ask Questions