Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Panos Papadimitratos

Beyond Context: Large Language Models Failure to Grasp Users Intent

Dec 24, 2025

Ahmed M. Hussain, Salahuddin Salahuddin, Panos Papadimitratos

Abstract:Current Large Language Models (LLMs) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability to understand context and recognize user intent. This creates exploitable vulnerabilities that malicious users can systematically leverage to circumvent safety mechanisms. We empirically evaluate multiple state-of-the-art LLMs, including ChatGPT, Claude, Gemini, and DeepSeek. Our analysis demonstrates the circumvention of reliable safety mechanisms through emotional framing, progressive revelation, and academic justification techniques. Notably, reasoning-enabled configurations amplified rather than mitigated the effectiveness of exploitation, increasing factual precision while failing to interrogate the underlying intent. The exception was Claude Opus 4.1, which prioritized intent detection over information provision in some use cases. This pattern reveals that current architectural designs create systematic vulnerabilities. These limitations require paradigmatic shifts toward contextual understanding and intent recognition as core safety capabilities rather than post-hoc protective mechanisms.

* 22 pages and 23 figures

Via

Access Paper or Ask Questions

Attention in Motion: Secure Platooning via Transformer-based Misbehavior Detection

Dec 22, 2025

Konstantinos Kalogiannis, Ahmed Mohamed Hussain, Hexu Li, Panos Papadimitratos

Abstract:Vehicular platooning promises transformative improvements in transportation efficiency and safety through the coordination of multi-vehicle formations enabled by Vehicle-to-Everything (V2X) communication. However, the distributed nature of platoon coordination creates security vulnerabilities, allowing authenticated vehicles to inject falsified kinematic data, compromise operational stability, and pose a threat to passenger safety. Traditional misbehaviour detection approaches, which rely on plausibility checks and statistical methods, suffer from high False Positive (FP) rates and cannot capture the complex temporal dependencies inherent in multi-vehicle coordination dynamics. We present Attention In Motion (AIMformer), a transformer-based framework specifically tailored for real-time misbehaviour detection in vehicular platoons with edge deployment capabilities. AIMformer leverages multi-head self-attention mechanisms to simultaneously capture intra-vehicle temporal dynamics and inter-vehicle spatial correlations. It incorporates global positional encoding with vehicle-specific temporal offsets to handle join/exit maneuvers. We propose a Precision-Focused Binary Cross-Entropy (PFBCE) loss function that penalizes FPs to meet the requirements of safety-critical vehicular systems. Extensive evaluation across 4 platoon controllers, multiple attack vectors, and diverse mobility scenarios demonstrates superior performance ($\geq$ 0.93) compared to state-of-the-art baseline architectures. A comprehensive deployment analysis utilizing TensorFlow Lite (TFLite), Open Neural Network Exchange (ONNX), and TensorRT achieves sub-millisecond inference latency, making it suitable for real-time operation on resource-constrained edge platforms. Hence, validating AIMformer is viable for both in-vehicle and roadside infrastructure deployment.

* 16 pages and 10 figures

Via

Access Paper or Ask Questions

Jailbreaking Large Language Models Through Content Concretization

Sep 16, 2025

Johan Wahréus, Ahmed Hussain, Panos Papadimitratos

Abstract:Large Language Models (LLMs) are increasingly deployed for task automation and content generation, yet their safety mechanisms remain vulnerable to circumvention through different jailbreaking techniques. In this paper, we introduce \textit{Content Concretization} (CC), a novel jailbreaking technique that iteratively transforms abstract malicious requests into concrete, executable implementations. CC is a two-stage process: first, generating initial LLM responses using lower-tier, less constrained safety filters models, then refining them through higher-tier models that process both the preliminary output and original prompt. We evaluate our technique using 350 cybersecurity-specific prompts, demonstrating substantial improvements in jailbreak Success Rates (SRs), increasing from 7\% (no refinements) to 62\% after three refinement iterations, while maintaining a cost of 7.5\textcent~per prompt. Comparative A/B testing across nine different LLM evaluators confirms that outputs from additional refinement steps are consistently rated as more malicious and technically superior. Moreover, manual code analysis reveals that generated outputs execute with minimal modification, although optimal deployment typically requires target-specific fine-tuning. With eventual improved harmful code generation, these results highlight critical vulnerabilities in current LLM safety frameworks.

* Accepted for presentation in the Conference on Game Theory and AI for Security (GameSec) 2025

Via

Access Paper or Ask Questions

AttentionGuard: Transformer-based Misbehavior Detection for Secure Vehicular Platoons

May 15, 2025

Hexu Li, Konstantinos Kalogiannis, Ahmed Mohamed Hussain, Panos Papadimitratos

Figure 1 for AttentionGuard: Transformer-based Misbehavior Detection for Secure Vehicular Platoons

Figure 2 for AttentionGuard: Transformer-based Misbehavior Detection for Secure Vehicular Platoons

Figure 3 for AttentionGuard: Transformer-based Misbehavior Detection for Secure Vehicular Platoons

Figure 4 for AttentionGuard: Transformer-based Misbehavior Detection for Secure Vehicular Platoons

Abstract:Vehicle platooning, with vehicles traveling in close formation coordinated through Vehicle-to-Everything (V2X) communications, offers significant benefits in fuel efficiency and road utilization. However, it is vulnerable to sophisticated falsification attacks by authenticated insiders that can destabilize the formation and potentially cause catastrophic collisions. This paper addresses this challenge: misbehavior detection in vehicle platooning systems. We present AttentionGuard, a transformer-based framework for misbehavior detection that leverages the self-attention mechanism to identify anomalous patterns in mobility data. Our proposal employs a multi-head transformer-encoder to process sequential kinematic information, enabling effective differentiation between normal mobility patterns and falsification attacks across diverse platooning scenarios, including steady-state (no-maneuver) operation, join, and exit maneuvers. Our evaluation uses an extensive simulation dataset featuring various attack vectors (constant, gradual, and combined falsifications) and operational parameters (controller types, vehicle speeds, and attacker positions). Experimental results demonstrate that AttentionGuard achieves up to 0.95 F1-score in attack detection, with robust performance maintained during complex maneuvers. Notably, our system performs effectively with minimal latency (100ms decision intervals), making it suitable for real-time transportation safety applications. Comparative analysis reveals superior detection capabilities and establishes the transformer-encoder as a promising approach for securing Cooperative Intelligent Transport Systems (C-ITS) against sophisticated insider threats.

* Author's version; Accepted for presentation at the ACM Workshop on Wireless Security and Machine Learning (WiseML 2025)

Via

Access Paper or Ask Questions

Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing

Mar 27, 2025

Johan Wahréus, Ahmed Hussain, Panos Papadimitratos

Abstract:Large Language Models (LLMs) have transformed task automation and content generation across various domains while incorporating safety filters to prevent misuse. We introduce a novel jailbreaking framework that employs distributed prompt processing combined with iterative refinements to bypass these safety measures, particularly in generating malicious code. Our architecture consists of four key modules: prompt segmentation, parallel processing, response aggregation, and LLM-based jury evaluation. Tested on 500 malicious prompts across 10 cybersecurity categories, the framework achieves a 73.2% Success Rate (SR) in generating malicious code. Notably, our comparative analysis reveals that traditional single-LLM judge evaluation overestimates SRs (93.8%) compared to our LLM jury system (73.2%), with manual verification confirming that single-judge assessments often accept incomplete implementations. Moreover, we demonstrate that our distributed architecture improves SRs by 12% over the non-distributed approach in an ablation study, highlighting both the effectiveness of distributed prompt processing and the importance of robust evaluation methodologies in assessing jailbreak attempts.

* 22 pages; 26 figures

Via

Access Paper or Ask Questions

UnReference: analysis of the effect of spoofing on RTK reference stations for connected rovers

Mar 26, 2025

Marco Spanghero, Panos Papadimitratos

Abstract:Global Navigation Satellite Systems (GNSS) provide standalone precise navigation for a wide gamut of applications. Nevertheless, applications or systems such as unmanned vehicles (aerial or ground vehicles and surface vessels) generally require a much higher level of accuracy than those provided by standalone receivers. The most effective and economical way of achieving centimeter-level accuracy is to rely on corrections provided by fixed \emph{reference station} receivers to improve the satellite ranging measurements. Differential GNSS (DGNSS) and Real Time Kinematics (RTK) provide centimeter-level accuracy by distributing online correction streams to connected nearby mobile receivers typically termed \emph{rovers}. However, due to their static nature, reference stations are prime targets for GNSS attacks, both simplistic jamming and advanced spoofing, with different levels of adversarial control and complexity. Jamming the reference station would deny corrections and thus accuracy to the rovers. Spoofing the reference station would force it to distribute misleading corrections. As a result, all connected rovers using those corrections will be equally influenced by the adversary independently of their actual trajectory. We evaluate a battery of tests generated with an RF simulator to test the robustness of a common DGNSS/RTK processing library and receivers. We test both jamming and synchronized spoofing to demonstrate that adversarial action on the rover using reference spoofing is both effective and convenient from an adversarial perspective. Additionally, we discuss possible strategies based on existing countermeasures (self-validation of the PNT solution and monitoring of own clock drift) that the rover and the reference station can adopt to avoid using or distributing bogus corrections.

* To appear the the 2025 IEEE/ION Position, Navigation and Localization Symposium

Via

Access Paper or Ask Questions

GNSS jammer localization and identification with airborne commercial GNSS receivers

Mar 26, 2025

Marco Spanghero, Filip Geib, Ronny Panier, Panos Papadimitratos

Abstract:Global Navigation Satellite Systems (GNSS) are fundamental in ubiquitously providing position and time to a wide gamut of systems. Jamming remains a realistic threat in many deployment settings, civilian and tactical. Specifically, in Unmanned Aerial Vehicles (UAVs) sustained denial raises safety critical concerns. This work presents a strategy that allows detection, localization, and classification both in the frequency and time domain of interference signals harmful to navigation. A high-performance Vertical Take Off and Landing (VTOL) UAV with a single antenna and a commercial GNSS receiver is used to geolocate and characterize RF emitters at long range, to infer the navigation impairment. Raw IQ baseband snapshots from the GNSS receiver make the application of spectral correlation methods possible without extra software-defined radio payload, paving the way to spectrum identification and monitoring in airborne platforms, aiming at RF situational awareness. Live testing at Jammertest, in Norway, with portable, commercially available GNSS multi-band jammers demonstrates the ability to detect, localize, and characterize harmful interference. Our system pinpointed the position with an error of a few meters of the transmitter and the extent of the affected area at long range, without entering the denied zone. Additionally, further spectral content extraction is used to accurately identify the jammer frequency, bandwidth, and modulation scheme based on spectral correlation techniques.

* IEEE Transactions on Information Forensics and Security (2025)

Via

Access Paper or Ask Questions

LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis

Feb 27, 2025

Saeif Alhazbi, Ahmed Mohamed Hussain, Gabriele Oligeri, Panos Papadimitratos

Abstract:As Large Language Models (LLMs) become increasingly integrated into many technological ecosystems across various domains and industries, identifying which model is deployed or being interacted with is critical for the security and trustworthiness of the systems. Current verification methods typically rely on analyzing the generated output to determine the source model. However, these techniques are susceptible to adversarial attacks, operate in a post-hoc manner, and may require access to the model weights to inject a verifiable fingerprint. In this paper, we propose a novel passive and non-invasive fingerprinting technique that operates in real-time and remains effective even under encrypted network traffic conditions. Our method leverages the intrinsic autoregressive generation nature of language models, which generate text one token at a time based on all previously generated tokens, creating a unique temporal pattern like a rhythm or heartbeat that persists even when the output is streamed over a network. We find that measuring the Inter-Token Times (ITTs)-time intervals between consecutive tokens-can identify different language models with high accuracy. We develop a Deep Learning (DL) pipeline to capture these timing patterns using network traffic analysis and evaluate it on 16 Small Language Models (SLMs) and 10 proprietary LLMs across different deployment scenarios, including local host machine (GPU/CPU), Local Area Network (LAN), Remote Network, and Virtual Private Network (VPN). The experimental results confirm that our proposed technique is effective and maintains high accuracy even when tested in different network conditions. This work opens a new avenue for model identification in real-world scenarios and contributes to more secure and trustworthy language model deployment.

Via

Access Paper or Ask Questions

CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Jan 02, 2025

Johan Wahréus, Ahmed Mohamed Hussain, Panos Papadimitratos

Figure 1 for CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Figure 2 for CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Figure 3 for CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Figure 4 for CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Abstract:Numerous studies have investigated methods for jailbreaking Large Language Models (LLMs) to generate harmful content. Typically, these methods are evaluated using datasets of malicious prompts designed to bypass security policies established by LLM providers. However, the generally broad scope and open-ended nature of existing datasets can complicate the assessment of jailbreaking effectiveness, particularly in specific domains, notably cybersecurity. To address this issue, we present and publicly release CySecBench, a comprehensive dataset containing 12662 prompts specifically designed to evaluate jailbreaking techniques in the cybersecurity domain. The dataset is organized into 10 distinct attack-type categories, featuring close-ended prompts to enable a more consistent and accurate assessment of jailbreaking attempts. Furthermore, we detail our methodology for dataset generation and filtration, which can be adapted to create similar datasets in other domains. To demonstrate the utility of CySecBench, we propose and evaluate a jailbreaking approach based on prompt obfuscation. Our experimental results show that this method successfully elicits harmful content from commercial black-box LLMs, achieving Success Rates (SRs) of 65% with ChatGPT and 88% with Gemini; in contrast, Claude demonstrated greater resilience with a jailbreaking SR of 17%. Compared to existing benchmark approaches, our method shows superior performance, highlighting the value of domain-specific evaluation datasets for assessing LLM security measures. Moreover, when evaluated using prompts from a widely used dataset (i.e., AdvBench), it achieved an SR of 78.5%, higher than the state-of-the-art methods.

Via

Access Paper or Ask Questions

Edge AI-based Radio Frequency Fingerprinting for IoT Networks

Dec 13, 2024

Ahmed Mohamed Hussain, Nada Abughanam, Panos Papadimitratos

Figure 1 for Edge AI-based Radio Frequency Fingerprinting for IoT Networks

Figure 2 for Edge AI-based Radio Frequency Fingerprinting for IoT Networks

Figure 3 for Edge AI-based Radio Frequency Fingerprinting for IoT Networks

Figure 4 for Edge AI-based Radio Frequency Fingerprinting for IoT Networks

Abstract:The deployment of the Internet of Things (IoT) in smart cities and critical infrastructure has enhanced connectivity and real-time data exchange but introduced significant security challenges. While effective, cryptography can often be resource-intensive for small-footprint resource-constrained (i.e., IoT) devices. Radio Frequency Fingerprinting (RFF) offers a promising authentication alternative by using unique RF signal characteristics for device identification at the Physical (PHY)-layer, without resorting to cryptographic solutions. The challenge is two-fold: how to deploy such RFF in a large scale and for resource-constrained environments. Edge computing, processing data closer to its source, i.e., the wireless device, enables faster decision-making, reducing reliance on centralized cloud servers. Considering a modest edge device, we introduce two truly lightweight Edge AI-based RFF schemes tailored for resource-constrained devices. We implement two Deep Learning models, namely a Convolution Neural Network and a Transformer-Encoder, to extract complex features from the IQ samples, forming device-specific RF fingerprints. We convert the models to TensorFlow Lite and evaluate them on a Raspberry Pi, demonstrating the practicality of Edge deployment. Evaluations demonstrate the Transformer-Encoder outperforms the CNN in identifying unique transmitter features, achieving high accuracy (> 0.95) and ROC-AUC scores (> 0.90) while maintaining a compact model size of 73KB, appropriate for resource-constrained devices.

* 11 pages, and 8 figures

Via

Access Paper or Ask Questions