Dan Ristea

One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image

Apr 02, 2025

Ezzeldin Shereen, Dan Ristea, Burak Hasircioglu, Shae McFadden, Vasilios Mavroudis, Chris Hicks

Abstract: Multimodal retrieval augmented generation (M-RAG) has recently emerged as a method to inhibit hallucinations of large multimodal models (LMMs) through a factual knowledge base (KB). However, M-RAG also introduces new attack vectors for adversaries that aim to disrupt the system by injecting malicious entries into the KB. In this work, we present a poisoning attack against M-RAG targeting visual document retrieval applications, where the KB contains images of document pages. Our objective is to craft a single image that is retrieved for a variety of different user queries, and consistently influences the output produced by the generative model, thus creating a universal denial-of-service (DoS) attack against the M-RAG system. We demonstrate that while our attack is effective against a diverse range of widely-used, state-of-the-art retrievers (embedding models) and generators (LMMs), it can also be ineffective against robust embedding models. Our attack not only highlights the vulnerability of M-RAG pipelines to poisoning attacks, but also sheds light on a fundamental weakness that potentially hinders their performance even in benign settings.

* 8 pages, 6 figures


SoK: On Closing the Applicability Gap in Automated Vulnerability Detection

Dec 15, 2024

Ezzeldin Shereen, Dan Ristea, Sanyam Vyas, Shae McFadden, Madeleine Dwyer, Chris Hicks, Vasilios Mavroudis

Abstract: The frequent discovery of security vulnerabilities in both open-source and proprietary software underscores the urgent need for earlier detection during the development lifecycle. Initiatives such as DARPA's Artificial Intelligence Cyber Challenge (AIxCC) aim to accelerate Automated Vulnerability Detection (AVD), seeking to address this challenge by autonomously analyzing source code to identify vulnerabilities. This paper addresses two primary research questions: (RQ1) How is current AVD research distributed across its core components? (RQ2) What key areas should future research target to bridge the gap in the practical applicability of AVD throughout software development? To answer these questions, we conduct a systematization over 79 AVD articles and 17 empirical studies, analyzing them across five core components: task formulation and granularity, input programming languages and representations, detection approaches and key solutions, evaluation metrics and datasets, and reported performance. Our systematization reveals that the narrow focus of AVD research, mainly on specific tasks and programming languages, limits its practical impact and overlooks broader areas crucial for effective, real-world vulnerability detection. We identify significant challenges, including the need for diversified problem formulations, varied detection granularities, broader language support, better dataset quality, enhanced reproducibility, and increased practical impact. Based on these findings we identify research directions that will enhance the effectiveness and applicability of AVD solutions in software security.


Benchmarking OpenAI o1 in Cyber Security

Oct 29, 2024

Dan Ristea, Vasilios Mavroudis, Chris Hicks

Abstract: We evaluate OpenAI's o1-preview and o1-mini models, benchmarking their performance against the earlier GPT-4o model. Our evaluation focuses on their ability to detect vulnerabilities in real-world software by generating structured inputs that trigger known sanitizers. Using DARPA's AI Cyber Challenge (AIxCC) framework and the Nginx challenge project (a deliberately modified version of the widely-used Nginx web server), we create a well-defined yet complex environment for testing LLMs on automated vulnerability detection (AVD) tasks. Our results show that the o1-preview model significantly outperforms GPT-4o in both success rate and efficiency, especially in more complex scenarios.
