
Vinod P

XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs

Apr 30, 2025

Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera, Vinod P


Abstract: Large Language Models (LLMs) are fundamental actors in the modern IT landscape dominated by AI solutions. However, security threats associated with them might prevent their reliable adoption in critical application scenarios such as government organizations and medical institutions. For this reason, commercial LLMs typically undergo a sophisticated censoring mechanism to eliminate any harmful output they could possibly produce. LLM jailbreaking poses a significant threat to such protections, and many previous approaches have already demonstrated its effectiveness across diverse domains. Existing jailbreak proposals mostly adopt a generate-and-test strategy to craft malicious input. To improve the comprehension of censoring mechanisms and design a targeted jailbreak attack, we propose an Explainable-AI solution that comparatively analyzes the behavior of censored and uncensored models to derive unique exploitable alignment patterns. Then, we propose XBreaking, a novel jailbreak attack that exploits these unique patterns to break the security constraints of LLMs by targeted noise injection. Our thorough experimental campaign returns important insights about the censoring mechanisms and demonstrates the effectiveness and performance of our attack.


Privacy-Preserving in Blockchain-based Federated Learning Systems

Jan 07, 2024

Sameera K. M., Serena Nicolazzo, Marco Arazzi, Antonino Nocera, Rafidha Rehiman K. A., Vinod P, Mauro Conti


Abstract: Federated Learning (FL) has recently arisen as a revolutionary approach to collaboratively training Machine Learning models. According to this novel framework, multiple participants train a global model collaboratively, coordinating with a central aggregator without sharing their local data. As FL gains popularity in diverse domains, security and privacy concerns arise due to the distributed nature of this solution. Therefore, integrating this strategy with Blockchain technology has been consolidated as a preferred choice to ensure the privacy and security of participants. This paper explores the research efforts carried out by the scientific community to define privacy solutions in scenarios adopting Blockchain-Enabled FL. It comprehensively summarizes the background related to FL and Blockchain, evaluates existing architectures for their integration, and examines the primary attacks and possible countermeasures to guarantee privacy in this setting. Finally, it reviews the main application scenarios where Blockchain-Enabled FL approaches have been successfully applied. This survey can help academia and industry practitioners understand which theories and techniques exist to improve the performance of FL through Blockchain to preserve privacy, and what the main challenges and future directions are in this novel and still under-explored context. We believe this work provides a novel contribution with respect to previous surveys and is a valuable tool to explore the current landscape, understand perspectives, and pave the way for advancements or improvements in this amalgamation of Blockchain and Federated Learning.

* 44 pages, 11 figures
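
The aggregation step mentioned in the abstract, where a central aggregator combines participant models without seeing local data, can be illustrated with a minimal sketch. It assumes plain weighted averaging of client weights (the widely used FedAvg rule); the survey does not prescribe this rule, and the names `federated_average` and `Weights` are hypothetical, chosen for this example only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each model is a mapping from a parameter
# name to a flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local sample count.

    Each update is (weights, num_local_samples). Only weights and counts
    reach the aggregator; the raw local data never leaves the client.
    """
    if not updates:
        raise ValueError("at least one client update is required")
    total_samples = sum(count for _, count in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros shaped like the first client's model.
    first_weights = updates[0][0]
    aggregated = {name: [0.0] * len(values) for name, values in first_weights.items()}

    for weights, count in updates:
        share = count / total_samples  # this client's contribution
        for name, values in weights.items():
            for i, value in enumerate(values):
                aggregated[name][i] += share * value
    return aggregated


if __name__ == "__main__":
    client_a = ({"w": [1.0, 2.0]}, 1)  # 1 local sample
    client_b = ({"w": [3.0, 4.0]}, 3)  # 3 local samples
    print(federated_average([client_a, client_b]))
```

In a Blockchain-Enabled FL setting, the same averaging could be carried out by a smart contract or a committee of nodes instead of a single server; that replacement is outside the scope of this sketch.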


GANG-MAM: GAN based enGine for Modifying Android Malware

Sep 27, 2021

Renjith G, Sonia Laudanna, Aji S, Corrado Aaron Visaggio, Vinod P


Abstract: Malware detectors based on machine learning are vulnerable to adversarial attacks. Generative Adversarial Networks (GANs) are architectures based on Neural Networks that could produce successful adversarial samples. Interest in this technology is growing quickly. In this paper, we propose a system that produces a feature vector for making an Android malware strongly evasive and then modifies the malicious program accordingly. Such a system could have a twofold contribution: it could be used to generate datasets to validate systems for detecting GAN-based malware and to enlarge the training and testing dataset for making more robust malware classifiers.


Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning Approach

Apr 20, 2019

Rahim Taheri, Reza Javidan, Mohammad Shojafar, Vinod P, Mauro Conti


Abstract: The widespread adoption of smartphones dramatically increases the risk of attacks and the spread of mobile malware, especially on the Android platform. Machine-learning-based solutions have already been used as a tool to supersede signature-based anti-malware systems. However, malware authors leverage attributes from malicious and legitimate samples to estimate statistical differences in order to create adversarial examples. Hence, to evaluate the vulnerability of machine learning algorithms in malware detection, we propose five different attack scenarios to perturb malicious applications (apps). By doing this, the classification algorithm inappropriately fits a discriminant function on the set of data points, eventually yielding a higher misclassification rate. Further, to distinguish the adversarial examples from benign samples, we propose two defense mechanisms to counter attacks. To validate our attacks and solutions, we test our model on three different benchmark datasets. We also test our methods using various classifier algorithms and compare them with the state-of-the-art data poisoning method using the Jacobian matrix. Promising results show that generated adversarial samples can evade detection with a very high probability. Additionally, evasive variants generated by our attack models, when used to harden the developed anti-malware system, improve the detection rate.

* 20 pages, 6 figures, 5 tables


FeatureAnalytics: An approach to derive relevant attributes for analyzing Android Malware

Sep 17, 2018

Deepa K, Radhamani G, Vinod P, Mohammad Shojafar, Neeraj Kumar, Mauro Conti


Abstract: The ever-increasing volume of Android malware has always been a concern for cybersecurity professionals. Even though plenty of anti-malware solutions exist, a rational and pragmatic approach for the same is rare and has to be inspected further. In this paper, we propose a novel two-set feature selection approach based on Rough Set and Statistical Test, named RSST, to extract relevant system calls. To address the problem of a higher-dimensional attribute set, we derive a suboptimal system call space by applying the proposed feature selection method to maximize the separability between malware and benign samples. Comprehensive experiments conducted on a dataset consisting of 3500 samples with 30 RSST-derived essential system calls resulted in an accuracy of 99.9%, Area Under Curve (AUC) of 1.0, with 1% False Positive Rate (FPR). However, other feature selectors (Information Gain, CFsSubsetEval, ChiSquare, FreqSel and Symmetric Uncertainty) used in the domain of malware analysis resulted in an accuracy of 95.5% with 8.5% FPR. Besides, empirical analysis shows that RSST-derived system calls outperform other attributes such as permissions, opcodes, API, methods, call graphs, Droidbox attributes and network traces.

* 26 pages, 6 figures, 9 tables, Journal Submission
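
The idea of keeping only the system calls that best separate malware from benign samples can be sketched with a much simpler stand-in. This is not RSST: the rough-set stage is omitted entirely, and the statistical test is assumed here to be Welch's t-statistic, which the abstract does not specify. The function names and the toy call counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List


def welch_t(a: List[float], b: List[float]) -> float:
    """Welch's t-statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if standard_error == 0.0:
        # Both samples are constant: equal means carry no signal,
        # different means separate the classes perfectly.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / standard_error


def rank_system_calls(
    malware: Dict[str, List[float]],
    benign: Dict[str, List[float]],
    top_k: int,
) -> List[str]:
    """Return the top_k system calls with the largest |t| between classes.

    Both inputs map a system call name to its per-sample frequencies and
    need at least two samples per call.
    """
    scores = {call: abs(welch_t(malware[call], benign[call])) for call in malware}
    return sorted(scores, key=lambda call: scores[call], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy frequencies, invented for illustration only.
    malware_calls = {"sendto": [9.0, 10.0, 11.0], "read": [5.0, 6.0, 7.0]}
    benign_calls = {"sendto": [1.0, 2.0, 3.0], "read": [5.0, 7.0, 6.0]}
    print(rank_system_calls(malware_calls, benign_calls, top_k=1))
```

A univariate ranking like this scores each call in isolation, so it can retain redundant calls; a subset-based method such as the rough-set stage described in the abstract would be needed to address that.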
