Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dario Pasquini

Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks

Oct 28, 2024

Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese

Abstract:Large language models (LLMs) are increasingly being harnessed to automate cyberattacks, making sophisticated exploits more accessible and scalable. In response, we propose a new defense strategy tailored to counter LLM-driven cyberattacks. We introduce Mantis, a defensive framework that exploits LLMs' susceptibility to adversarial inputs to undermine malicious operations. Upon detecting an automated cyberattack, Mantis plants carefully crafted inputs into system responses, leading the attacker's LLM to disrupt their own operations (passive defense) or even compromise the attacker's machine (active defense). By deploying purposefully vulnerable decoy services to attract the attacker and using dynamic prompt injections for the attacker's LLM, Mantis can autonomously hack back the attacker. In our experiments, Mantis consistently achieved over 95% effectiveness against automated LLM-driven attacks. To foster further research and collaboration, Mantis is available as an open-source tool: https://github.com/pasquini-dario/project_mantis

* v0.1

Via

Access Paper or Ask Questions

LLMmap: Fingerprinting For Large Language Models

Jul 24, 2024

Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese

Figure 1 for LLMmap: Fingerprinting For Large Language Models

Figure 2 for LLMmap: Fingerprinting For Large Language Models

Figure 3 for LLMmap: Fingerprinting For Large Language Models

Figure 4 for LLMmap: Fingerprinting For Large Language Models

Abstract:We introduce LLMmap, a first-generation fingerprinting attack targeted at LLM-integrated applications. LLMmap employs an active fingerprinting approach, sending carefully crafted queries to the application and analyzing the responses to identify the specific LLM model in use. With as few as 8 interactions, LLMmap can accurately identify LLMs with over 95% accuracy. More importantly, LLMmap is designed to be robust across different application layers, allowing it to identify LLMs operating under various system prompts, stochastic sampling hyperparameters, and even complex generation frameworks such as RAG or Chain-of-Thought.

* version 0.1 (added missing refs)

Via

Access Paper or Ask Questions

Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks

Mar 06, 2024

Dario Pasquini, Martin Strohmeier, Carmela Troncoso

Figure 1 for Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks

Figure 2 for Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks

Figure 3 for Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks

Figure 4 for Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks

Abstract:We introduce a new family of prompt injection attacks, termed Neural Exec. Unlike known attacks that rely on handcrafted strings (e.g., "Ignore previous instructions and..."), we show that it is possible to conceptualize the creation of execution triggers as a differentiable search problem and use learning-based methods to autonomously generate them. Our results demonstrate that a motivated adversary can forge triggers that are not only drastically more effective than current handcrafted ones but also exhibit inherent flexibility in shape, properties, and functionality. In this direction, we show that an attacker can design and generate Neural Execs capable of persisting through multi-stage preprocessing pipelines, such as in the case of Retrieval-Augmented Generation (RAG)-based applications. More critically, our findings show that attackers can produce triggers that deviate markedly in form and shape from any known attack, sidestepping existing blacklist-based detection and sanitation approaches.

* v0.1

Via

Access Paper or Ask Questions

Can Decentralized Learning be more robust than Federated Learning?

Mar 07, 2023

Mathilde Raynal, Dario Pasquini, Carmela Troncoso

Abstract:Decentralized Learning (DL) is a peer--to--peer learning approach that allows a group of users to jointly train a machine learning model. To ensure correctness, DL should be robust, i.e., Byzantine users must not be able to tamper with the result of the collaboration. In this paper, we introduce two \textit{new} attacks against DL where a Byzantine user can: make the network converge to an arbitrary model of their choice, and exclude an arbitrary user from the learning process. We demonstrate our attacks' efficiency against Self--Centered Clipping, the state--of--the--art robust DL protocol. Finally, we show that the capabilities decentralization grants to Byzantine users result in decentralized learning \emph{always} providing less robustness than federated learning.

Via

Access Paper or Ask Questions

Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

Jan 18, 2023

Dario Pasquini, Giuseppe Ateniese, Carmela Troncoso

Figure 1 for Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

Figure 2 for Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

Figure 3 for Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

Figure 4 for Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

Abstract:We develop the first universal password model -- a password model that, once pre-trained, can automatically adapt to any password distribution. To achieve this result, the model does not need to access any plaintext passwords from the target set. Instead, it exploits users' auxiliary information, such as email addresses, as a proxy signal to predict the underlying target password distribution. The model uses deep learning to capture the correlation between the auxiliary data of a group of users (e.g., users of a web application) and their passwords. It then exploits those patterns to create a tailored password model for the target community at inference time. No further training steps, targeted data collection, or prior knowledge of the community's password distribution is required. Besides defining a new state-of-the-art for password strength estimation, our model enables any end-user (e.g., system administrators) to autonomously generate tailored password models for their systems without the often unworkable requirement of collecting suitable training data and fitting the underlying password model. Ultimately, our framework enables the democratization of well-calibrated password models to the community, addressing a major challenge in the deployment of password security solutions on a large scale.

* v0.01

Via

Access Paper or Ask Questions

On the Privacy of Decentralized Machine Learning

May 17, 2022

Dario Pasquini, Mathilde Raynal, Carmela Troncoso

Figure 1 for On the Privacy of Decentralized Machine Learning

Figure 2 for On the Privacy of Decentralized Machine Learning

Figure 3 for On the Privacy of Decentralized Machine Learning

Figure 4 for On the Privacy of Decentralized Machine Learning

Abstract:In this work, we carry out the first, in-depth, privacy analysis of Decentralized Learning -- a collaborative machine learning framework aimed at circumventing the main limitations of federated learning. We identify the decentralized learning properties that affect users' privacy and we introduce a suite of novel attacks for both passive and active decentralized adversaries. We demonstrate that, contrary to what is claimed by decentralized learning proposers, decentralized learning does not offer any security advantages over more practical approaches such as federated learning. Rather, it tends to degrade users' privacy by increasing the attack surface and enabling any user in the system to perform powerful privacy attacks such as gradient inversion, and even gain full control over honest users' local model. We also reveal that, given the state of the art in protections, privacy-preserving configurations of decentralized learning require abandoning any possible advantage over the federated setup, completely defeating the objective of the decentralized approach.

* 17 pages

Via

Access Paper or Ask Questions

Eluding Secure Aggregation in Federated Learning via Model Inconsistency

Nov 14, 2021

Dario Pasquini, Danilo Francati, Giuseppe Ateniese

Figure 1 for Eluding Secure Aggregation in Federated Learning via Model Inconsistency

Figure 2 for Eluding Secure Aggregation in Federated Learning via Model Inconsistency

Figure 3 for Eluding Secure Aggregation in Federated Learning via Model Inconsistency

Figure 4 for Eluding Secure Aggregation in Federated Learning via Model Inconsistency

Abstract:Federated learning allows a set of users to train a deep neural network over their private training datasets. During the protocol, datasets never leave the devices of the respective users. This is achieved by requiring each user to send "only" model updates to a central server that, in turn, aggregates them to update the parameters of the deep neural network. However, it has been shown that each model update carries sensitive information about the user's dataset (e.g., gradient inversion attacks). The state-of-the-art implementations of federated learning protect these model updates by leveraging secure aggregation: A cryptographic protocol that securely computes the aggregation of the model updates of the users. Secure aggregation is pivotal to protect users' privacy since it hinders the server from learning the value and the source of the individual model updates provided by the users, preventing inference and data attribution attacks. In this work, we show that a malicious server can easily elude secure aggregation as if the latter were not in place. We devise two different attacks capable of inferring information on individual private training datasets, independently of the number of users participating in the secure aggregation. This makes them concrete threats in large-scale, real-world federated learning applications. The attacks are generic and do not target any specific secure aggregation protocol. They are equally effective even if the secure aggregation protocol is replaced by its ideal functionality that provides the perfect level of security. Our work demonstrates that secure aggregation has been incorrectly combined with federated learning and that current implementations offer only a "false sense of security".

* Work in progress

Via

Access Paper or Ask Questions

Unleashing the Tiger: Inference Attacks on Split Learning

Dec 04, 2020

Dario Pasquini, Giuseppe Ateniese, Massimo Bernaschi

Figure 1 for Unleashing the Tiger: Inference Attacks on Split Learning

Figure 2 for Unleashing the Tiger: Inference Attacks on Split Learning

Figure 3 for Unleashing the Tiger: Inference Attacks on Split Learning

Figure 4 for Unleashing the Tiger: Inference Attacks on Split Learning

Abstract:We investigate the security of split learning -- a novel collaborative machine learning framework that enables peak performance by requiring minimal resources consumption. In the paper, we make explicit the vulnerabilities of the protocol and demonstrate its inherent insecurity by introducing general attack strategies targeting the reconstruction of clients' private training sets. More prominently, we demonstrate that a malicious server can actively hijack the learning process of the distributed model and bring it into an insecure state that enables inference attacks on clients' data. We implement different adaptations of the attack and test them on various datasets as well as within realistic threat scenarios. To make our results reproducible, we made our code available at https://github.com/pasquini-dario/SplitNN_FSHA.

* 11 pages. Work in progress

Via

Access Paper or Ask Questions

Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Oct 26, 2020

Dario Pasquini, Marco Cianfriglia, Giuseppe Ateniese, Massimo Bernaschi

Figure 1 for Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Figure 2 for Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Figure 3 for Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Figure 4 for Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Abstract:Password security hinges on an accurate understanding of the techniques adopted by attackers. However, current studies mostly rely on probabilistic password models that are imperfect proxies of real-world guessing strategies. The main reason is that attackers rely on very pragmatic approaches such as dictionary attacks. Unfortunately, it is inherently difficult to correctly model those methods. To be representative, dictionary attacks must be thoughtfully configured according to a process that requires an expertise that cannot be easily replicated in password studies. The consequence of inaccurately calibrating those attacks is the unreliability of password security estimates, impaired by measurement bias. In the present work, we introduce new guessing techniques that make dictionary attacks consistently more resilient to inadequate configurations. Our framework allows dictionary attacks to self-heal and converge towards optimal attacks' performance, requiring no supervision or domain-knowledge. To achieve this: (1) We use a deep neural network to model and then simulate the proficiency of expert adversaries. (2) Then, we introduce automatic dynamic strategies within dictionary attacks to mimic experts' ability to adapt their guessing strategies on the fly by incorporating knowledge on their targets. Our techniques enable robust and sound password strength estimates, eventually reducing bias in modeling real-world threats in password security.

Via

Access Paper or Ask Questions

Interpretable Probabilistic Password Strength Meters via Deep Learning

Apr 29, 2020

Dario Pasquini, Giuseppe Ateniese, Massimo Bernaschi

Figure 1 for Interpretable Probabilistic Password Strength Meters via Deep Learning

Figure 2 for Interpretable Probabilistic Password Strength Meters via Deep Learning

Figure 3 for Interpretable Probabilistic Password Strength Meters via Deep Learning

Figure 4 for Interpretable Probabilistic Password Strength Meters via Deep Learning

Abstract:Probabilistic password strength meters have been proved to be the most accurate tools to measure password strength. Unfortunately, by construction, they are limited to solely produce an opaque security estimation that fails to fully support the user during the password composition. In the present work, we move the first steps towards cracking the intelligibility barrier of this compelling class of meters. We show that probabilistic password meters inherently own the capability of describing the latent relation occurring between password strength and password structure. In our approach, the security contribution of each character composing a password is disentangled and used to provide explicit fine-grained feedback for the user. Furthermore, unlike existing heuristic constructions, our method is free from any human bias, and, more importantly, its feedback has a clear probabilistic interpretation. In our contribution: (1) we formulate the theoretical foundations of interpretable probabilistic password strength meters; (2) we describe how they can be implemented via an efficient and lightweight deep learning framework suitable for client-side operability.

Via

Access Paper or Ask Questions