Abstract:Large Language Models (LLMs) have transformed task automation and content generation across various domains while incorporating safety filters to prevent misuse. We introduce a novel jailbreaking framework that employs distributed prompt processing combined with iterative refinements to bypass these safety measures, particularly in generating malicious code. Our architecture consists of four key modules: prompt segmentation, parallel processing, response aggregation, and LLM-based jury evaluation. Tested on 500 malicious prompts across 10 cybersecurity categories, the framework achieves a 73.2% Success Rate (SR) in generating malicious code. Notably, our comparative analysis reveals that traditional single-LLM judge evaluation overestimates SRs (93.8%) compared to our LLM jury system (73.2%), with manual verification confirming that single-judge assessments often accept incomplete implementations. Moreover, we demonstrate that our distributed architecture improves SRs by 12% over the non-distributed approach in an ablation study, highlighting both the effectiveness of distributed prompt processing and the importance of robust evaluation methodologies in assessing jailbreak attempts.
Abstract:Global Navigation Satellite Systems (GNSS) are fundamental in ubiquitously providing position and time to a wide gamut of systems. Jamming remains a realistic threat in many deployment settings, civilian and tactical. Specifically, in Unmanned Aerial Vehicles (UAVs) sustained denial raises safety critical concerns. This work presents a strategy that allows detection, localization, and classification both in the frequency and time domain of interference signals harmful to navigation. A high-performance Vertical Take Off and Landing (VTOL) UAV with a single antenna and a commercial GNSS receiver is used to geolocate and characterize RF emitters at long range, to infer the navigation impairment. Raw IQ baseband snapshots from the GNSS receiver make the application of spectral correlation methods possible without extra software-defined radio payload, paving the way to spectrum identification and monitoring in airborne platforms, aiming at RF situational awareness. Live testing at Jammertest, in Norway, with portable, commercially available GNSS multi-band jammers demonstrates the ability to detect, localize, and characterize harmful interference. Our system pinpointed the position with an error of a few meters of the transmitter and the extent of the affected area at long range, without entering the denied zone. Additionally, further spectral content extraction is used to accurately identify the jammer frequency, bandwidth, and modulation scheme based on spectral correlation techniques.
Abstract:Global Navigation Satellite Systems (GNSS) provide standalone precise navigation for a wide gamut of applications. Nevertheless, applications or systems such as unmanned vehicles (aerial or ground vehicles and surface vessels) generally require a much higher level of accuracy than those provided by standalone receivers. The most effective and economical way of achieving centimeter-level accuracy is to rely on corrections provided by fixed \emph{reference station} receivers to improve the satellite ranging measurements. Differential GNSS (DGNSS) and Real Time Kinematics (RTK) provide centimeter-level accuracy by distributing online correction streams to connected nearby mobile receivers typically termed \emph{rovers}. However, due to their static nature, reference stations are prime targets for GNSS attacks, both simplistic jamming and advanced spoofing, with different levels of adversarial control and complexity. Jamming the reference station would deny corrections and thus accuracy to the rovers. Spoofing the reference station would force it to distribute misleading corrections. As a result, all connected rovers using those corrections will be equally influenced by the adversary independently of their actual trajectory. We evaluate a battery of tests generated with an RF simulator to test the robustness of a common DGNSS/RTK processing library and receivers. We test both jamming and synchronized spoofing to demonstrate that adversarial action on the rover using reference spoofing is both effective and convenient from an adversarial perspective. Additionally, we discuss possible strategies based on existing countermeasures (self-validation of the PNT solution and monitoring of own clock drift) that the rover and the reference station can adopt to avoid using or distributing bogus corrections.
Abstract:As Large Language Models (LLMs) become increasingly integrated into many technological ecosystems across various domains and industries, identifying which model is deployed or being interacted with is critical for the security and trustworthiness of the systems. Current verification methods typically rely on analyzing the generated output to determine the source model. However, these techniques are susceptible to adversarial attacks, operate in a post-hoc manner, and may require access to the model weights to inject a verifiable fingerprint. In this paper, we propose a novel passive and non-invasive fingerprinting technique that operates in real-time and remains effective even under encrypted network traffic conditions. Our method leverages the intrinsic autoregressive generation nature of language models, which generate text one token at a time based on all previously generated tokens, creating a unique temporal pattern like a rhythm or heartbeat that persists even when the output is streamed over a network. We find that measuring the Inter-Token Times (ITTs)-time intervals between consecutive tokens-can identify different language models with high accuracy. We develop a Deep Learning (DL) pipeline to capture these timing patterns using network traffic analysis and evaluate it on 16 Small Language Models (SLMs) and 10 proprietary LLMs across different deployment scenarios, including local host machine (GPU/CPU), Local Area Network (LAN), Remote Network, and Virtual Private Network (VPN). The experimental results confirm that our proposed technique is effective and maintains high accuracy even when tested in different network conditions. This work opens a new avenue for model identification in real-world scenarios and contributes to more secure and trustworthy language model deployment.
Abstract:Numerous studies have investigated methods for jailbreaking Large Language Models (LLMs) to generate harmful content. Typically, these methods are evaluated using datasets of malicious prompts designed to bypass security policies established by LLM providers. However, the generally broad scope and open-ended nature of existing datasets can complicate the assessment of jailbreaking effectiveness, particularly in specific domains, notably cybersecurity. To address this issue, we present and publicly release CySecBench, a comprehensive dataset containing 12662 prompts specifically designed to evaluate jailbreaking techniques in the cybersecurity domain. The dataset is organized into 10 distinct attack-type categories, featuring close-ended prompts to enable a more consistent and accurate assessment of jailbreaking attempts. Furthermore, we detail our methodology for dataset generation and filtration, which can be adapted to create similar datasets in other domains. To demonstrate the utility of CySecBench, we propose and evaluate a jailbreaking approach based on prompt obfuscation. Our experimental results show that this method successfully elicits harmful content from commercial black-box LLMs, achieving Success Rates (SRs) of 65% with ChatGPT and 88% with Gemini; in contrast, Claude demonstrated greater resilience with a jailbreaking SR of 17%. Compared to existing benchmark approaches, our method shows superior performance, highlighting the value of domain-specific evaluation datasets for assessing LLM security measures. Moreover, when evaluated using prompts from a widely used dataset (i.e., AdvBench), it achieved an SR of 78.5%, higher than the state-of-the-art methods.
Abstract:The deployment of the Internet of Things (IoT) in smart cities and critical infrastructure has enhanced connectivity and real-time data exchange but introduced significant security challenges. While effective, cryptography can often be resource-intensive for small-footprint resource-constrained (i.e., IoT) devices. Radio Frequency Fingerprinting (RFF) offers a promising authentication alternative by using unique RF signal characteristics for device identification at the Physical (PHY)-layer, without resorting to cryptographic solutions. The challenge is two-fold: how to deploy such RFF in a large scale and for resource-constrained environments. Edge computing, processing data closer to its source, i.e., the wireless device, enables faster decision-making, reducing reliance on centralized cloud servers. Considering a modest edge device, we introduce two truly lightweight Edge AI-based RFF schemes tailored for resource-constrained devices. We implement two Deep Learning models, namely a Convolution Neural Network and a Transformer-Encoder, to extract complex features from the IQ samples, forming device-specific RF fingerprints. We convert the models to TensorFlow Lite and evaluate them on a Raspberry Pi, demonstrating the practicality of Edge deployment. Evaluations demonstrate the Transformer-Encoder outperforms the CNN in identifying unique transmitter features, achieving high accuracy (> 0.95) and ROC-AUC scores (> 0.90) while maintaining a compact model size of 73KB, appropriate for resource-constrained devices.
Abstract:GNSS are indispensable for various applications, but they are vulnerable to spoofing attacks. The original receiver autonomous integrity monitoring (RAIM) was not designed for securing GNSS. In this context, RAIM was extended with wireless signals, termed signals of opportunity (SOPs), or onboard sensors, typically assumed benign. However, attackers might also manipulate wireless networks, raising the need for a solution that considers untrustworthy SOPs. To address this, we extend RAIM by incorporating all opportunistic information, i.e., measurements from terrestrial infrastructures and onboard sensors, culminating in one function for robust GNSS spoofing detection. The objective is to assess the likelihood of GNSS spoofing by analyzing locations derived from extended RAIM solutions, which include location solutions from GNSS pseudorange subsets and wireless signal subsets of untrusted networks. Our method comprises two pivotal components: subset generation and location fusion. Subsets of ranging information are created and processed through positioning algorithms, producing temporary locations. Onboard sensors provide speed, acceleration, and attitude data, aiding in location filtering based on motion constraints. The filtered locations, modeled with uncertainty, are fused into a composite likelihood function normalized for GNSS spoofing detection. Theoretical assessments of GNSS-only and multi-infrastructure scenarios under uncoordinated and coordinated attacks are conducted. The detection of these attacks is feasible when the number of benign subsets exceeds a specific threshold. A real-world dataset from the Kista area is used for experimental validation. Comparative analysis against baseline methods shows a significant improvement in detection accuracy achieved by our Gaussian Mixture RAIM approach. Moreover, we discuss leveraging RAIM results for plausible location recovery.
Abstract:Radio Frequency Fingerprinting (RFF) techniques promise to authenticate wireless devices at the physical layer based on inherent hardware imperfections introduced during manufacturing. Such RF transmitter imperfections are reflected into over-the-air signals, allowing receivers to accurately identify the RF transmitting source. Recent advances in Machine Learning, particularly in Deep Learning (DL), have improved the ability of RFF systems to extract and learn complex features that make up the device-specific fingerprint. However, integrating DL techniques with RFF and operating the system in real-world scenarios presents numerous challenges. This article identifies and analyzes these challenges while considering the three reference phases of any DL-based RFF system: (i) data collection and preprocessing, (ii) training, and finally, (iii) deployment. Our investigation points out the current open problems that prevent real deployment of RFF while discussing promising future directions, thus paving the way for further research in the area.
Abstract:Global Navigation Satellite Systems (GNSS) are integrated into many devices. However, civilian GNSS signals are usually not cryptographically protected. This makes attacks that forge signals relatively easy. Considering modern devices often have network connections and onboard sensors, the proposed here Probabilistic Detection of GNSS Spoofing (PDS) scheme is based on such opportunistic information. PDS has at its core two parts. First, a regression problem with motion model constraints, which equalizes the noise of all locations considering the motion model of the device. Second, a Gaussian process, that analyzes statistical properties of location data to construct uncertainty. Then, a likelihood function, that fuses the two parts, as a basis for a Neyman-Pearson lemma (NPL)-based detection strategy. Our experimental evaluation shows a performance gain over the state-of-the-art, in terms of attack detection effectiveness.
Abstract:It is well known that GNSS receivers are vulnerable to jamming and spoofing attacks, and numerous such incidents have been reported in the last decade all over the world. The notion of participatory sensing, or crowdsensing, is that a large ensemble of voluntary contributors provides measurements, rather than relying on a dedicated sensing infrastructure. The participatory sensing network under consideration in this work is based on GNSS receivers embedded in, for example, mobile phones. The provided measurements refer to the receiver-reported carrier-to-noise-density ratio ($C/N_0$) estimates or automatic gain control (AGC) values. In this work, we exploit $C/N_0$ measurements to locate a GNSS jammer, using multiple receivers in a crowdsourcing manner. We extend a previous jammer position estimator by only including data that is received during parts of the sensing period where jamming is detected by the sensor. In addition, we perform hardware testing for verification and evaluation of the proposed and compared state-of-the-art algorithms. Evaluations are performed using a Samsung S20+ mobile phone as participatory sensor and a Spirent GSS9000 GNSS simulator to generate GNSS and jamming signals. The proposed algorithm is shown to work well when using $C/N_0$ measurements and outperform the alternative algorithms in the evaluated scenarios, producing a median error of 50 meters when the pathloss exponent is 2. With higher pathloss exponents the error gets higher. The AGC output from the phone was too noisy and needs further processing to be useful for position estimation.