Abstract:The integration of large language models (LLMs) into cyber security applications presents significant opportunities, such as enhancing threat analysis and malware detection, but can also introduce critical risks and safety concerns, including personal data leakage and automated generation of new malware. To address these challenges, we developed CyberLLMInstruct, a dataset of 54,928 instruction-response pairs spanning cyber security tasks such as malware analysis, phishing simulations, and zero-day vulnerabilities. The dataset was constructed through a multi-stage process. This involved sourcing data from multiple resources, filtering and structuring it into instruction-response pairs, and aligning it with real-world scenarios to enhance its applicability. Seven open-source LLMs were chosen to test the usefulness of CyberLLMInstruct: Phi 3 Mini 3.8B, Mistral 7B, Qwen 2.5 7B, Llama 3 8B, Llama 3.1 8B, Gemma 2 9B, and Llama 2 70B. In our primary example, we rigorously assess the safety of fine-tuned models using the OWASP top 10 framework, finding that fine-tuning reduces safety resilience across all tested LLMs and every adversarial attack (e.g., the security score of Llama 3.1 8B against prompt injection drops from 0.95 to 0.15). In our second example, we show that these same fine-tuned models can also achieve up to 92.50 percent accuracy on the CyberMetric benchmark. These findings highlight a trade-off between performance and safety, showing the importance of adversarial testing and further research into fine-tuning methodologies that can mitigate safety risks while still improving performance across diverse datasets and domains. All scripts required to reproduce the dataset, along with examples and relevant resources for replicating our results, will be made available upon the paper's acceptance.
Abstract:Federated Learning (FL) in the Internet of Things (IoT) environments can enhance machine learning by utilising decentralised data, but at the same time, it might introduce significant privacy and security concerns due to the constrained nature of IoT devices. This represents a research challenge that we aim to address in this paper. We systematically analysed recent literature to identify privacy threats in FL within IoT environments, and evaluate the defensive measures that can be employed to mitigate these threats. Using a Systematic Literature Review (SLR) approach, we searched five publication databases (Scopus, IEEE Xplore, Wiley, ACM, and Science Direct), collating relevant papers published between 2017 and April 2024, a period which spans from the introduction of FL until now. Guided by the PRISMA protocol, we selected 49 papers to focus our systematic review on. We analysed these papers, paying special attention to the privacy threats and defensive measures -- specifically within the context of IoT -- using inclusion and exclusion criteria tailored to highlight recent advances and critical insights. We identified various privacy threats, including inference attacks, poisoning attacks, and eavesdropping, along with defensive measures such as Differential Privacy and Secure Multi-Party Computation. These defences were evaluated for their effectiveness in protecting privacy without compromising the functional integrity of FL in IoT settings. Our review underscores the necessity for robust and efficient privacy-preserving strategies tailored for IoT environments. Notably, there is a need for strategies against replay, evasion, and model stealing attacks. Exploring lightweight defensive measures and emerging technologies such as blockchain may help improve the privacy of FL in IoT, leading to the creation of FL models that can operate under variable network conditions.