Abstract:Blockchain-based Federated Learning (FL) is an emerging decentralized machine learning paradigm that enables model training without relying on a central server. Although some BFL frameworks are considered privacy-preserving, they are still vulnerable to various attacks, including inference and model poisoning. Additionally, most of these solutions employ strong trust assumptions among all participating entities or introduce incentive mechanisms to encourage collaboration, making them susceptible to multiple security flaws. This work presents VerifBFL, a trustless, privacy-preserving, and verifiable federated learning framework that integrates blockchain technology and cryptographic protocols. By employing zero-knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARKs) and incrementally verifiable computation (IVC), VerifBFL ensures the verifiability of both local training and aggregation processes. The proofs of training and aggregation are verified on-chain, guaranteeing the integrity and auditability of each participant's contributions. To protect training data from inference attacks, VerifBFL leverages differential privacy. Finally, to demonstrate the efficiency of the proposed protocols, we built a proof of concept using emerging tools. The results show that generating proofs for local training and aggregation in VerifBFL takes less than 81s and 2s, respectively, while verifying them on-chain takes less than 0.6s.
Abstract:Blockchained federated learning (BFL) combines the concepts of federated learning and blockchain technology to enhance privacy, security, and transparency in collaborative machine learning models. However, implementing BFL frameworks poses challenges in terms of scalability and cost-effectiveness. Reputation-aware BFL poses even more challenges, as blockchain validators are tasked with processing federated learning transactions along with the transactions that evaluate FL tasks and aggregate reputations. This leads to faster blockchain congestion and performance degradation. To improve BFL efficiency while increasing scalability and reducing on-chain reputation management costs, this paper proposes AutoDFL, a scalable and automated reputation-aware decentralized federated learning framework. AutoDFL leverages zk-Rollups as a Layer-2 scaling solution to boost the performance while maintaining the same level of security as the underlying Layer-1 blockchain. Moreover, AutoDFL introduces an automated and fair reputation model designed to incentivize federated learning actors. We develop a proof of concept for our framework for an accurate evaluation. Tested with various custom workloads, AutoDFL reaches an average throughput of over 3000 TPS with a gas reduction of up to 20X.
Abstract:The integration of Internet of Things (IoT) technology in various domains has led to operational advancements, but it has also introduced new vulnerabilities to cybersecurity threats, as evidenced by recent widespread cyberattacks on IoT devices. Intrusion detection systems are often reactive, triggered by specific patterns or anomalies observed within the network. To address this challenge, this work proposes a proactive approach to anticipate and preemptively mitigate malicious activities, aiming to prevent potential damage before it occurs. This paper proposes an innovative intrusion prediction framework empowered by Pre-trained Large Language Models (LLMs). The framework incorporates two LLMs: a fine-tuned Bidirectional and AutoRegressive Transformers (BART) model for predicting network traffic and a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for evaluating the predicted traffic. By harnessing the bidirectional capabilities of BART the framework then identifies malicious packets among these predictions. Evaluated using the CICIoT2023 IoT attack dataset, our framework showcases a notable enhancement in predictive performance, attaining an impressive 98% overall accuracy, providing a powerful response to the cybersecurity challenges that confront IoT networks.
Abstract:In recent years, numerous large-scale cyberattacks have exploited Internet of Things (IoT) devices, a phenomenon that is expected to escalate with the continuing proliferation of IoT technology. Despite considerable efforts in attack detection, intrusion detection systems remain mostly reactive, responding to specific patterns or observed anomalies. This work proposes a proactive approach to anticipate and mitigate malicious activities before they cause damage. This paper proposes a novel network intrusion prediction framework that combines Large Language Models (LLMs) with Long Short Term Memory (LSTM) networks. The framework incorporates two LLMs in a feedback loop: a fine-tuned Generative Pre-trained Transformer (GPT) model for predicting network traffic and a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) for evaluating the predicted traffic. The LSTM classifier model then identifies malicious packets among these predictions. Our framework, evaluated on the CICIoT2023 IoT attack dataset, demonstrates a significant improvement in predictive capabilities, achieving an overall accuracy of 98%, offering a robust solution to IoT cybersecurity challenges.
Abstract:In the rapidly evolving landscape of cyber threats targeting the Internet of Things (IoT) ecosystem, and in light of the surge in botnet-driven Distributed Denial of Service (DDoS) and brute force attacks, this study focuses on the early detection of IoT bots. It specifically addresses the detection of stealth bot communication that precedes and orchestrates attacks. This study proposes a comprehensive methodology for analyzing IoT network traffic, including considerations for both unidirectional and bidirectional flow, as well as packet formats. It explores a wide spectrum of network features critical for representing network traffic and characterizing benign IoT traffic patterns effectively. Moreover, it delves into the modeling of traffic using various semi-supervised learning techniques. Through extensive experimentation with the IoT-23 dataset - a comprehensive collection featuring diverse botnet types and traffic scenarios - we have demonstrated the feasibility of detecting botnet traffic corresponding to different operations and types of bots, specifically focusing on stealth command and control (C2) communications. The results obtained have demonstrated the feasibility of identifying C2 communication with a 100% success rate through packet-based methods and 94% via flow based approaches, with a false positive rate of 1.53%.
Abstract:The introduction of 6G technology into the Internet of Vehicles (IoV) promises to revolutionize connectivity with ultra-high data rates and seamless network coverage. However, this technological leap also brings significant challenges, particularly for the dynamic and diverse IoV landscape, which must meet the rigorous reliability and security requirements of 6G networks. Furthermore, integrating 6G will likely increase the IoV's susceptibility to a spectrum of emerging cyber threats. Therefore, it is crucial for security mechanisms to dynamically adapt and learn new attack patterns, keeping pace with the rapid evolution and diversification of these threats - a capability currently lacking in existing systems. This paper presents a novel intrusion detection system leveraging the paradigm of life-long (or continual) learning. Our methodology combines class-incremental learning with federated learning, an approach ideally suited to the distributed nature of the IoV. This strategy effectively harnesses the collective intelligence of Connected and Automated Vehicles (CAVs) and edge computing capabilities to train the detection system. To the best of our knowledge, this study is the first to synergize class-incremental learning with federated learning specifically for cyber attack detection. Through comprehensive experiments on a recent network traffic dataset, our system has exhibited a robust adaptability in learning new cyber attack patterns, while effectively retaining knowledge of previously encountered ones. Additionally, it has proven to maintain high accuracy and a low false positive rate.
Abstract:The Internet of Vehicles (IoV) is a crucial technology for Intelligent Transportation Systems (ITS) that integrates vehicles with the Internet and other entities. The emergence of 5G and the forthcoming 6G networks presents an enormous potential to transform the IoV by enabling ultra-reliable, low-latency, and high-bandwidth communications. Nevertheless, as connectivity expands, cybersecurity threats have become a significant concern. The issue has been further exacerbated by the rising number of zero-day (0-day) attacks, which can exploit unknown vulnerabilities and bypass existing Intrusion Detection Systems (IDSs). In this paper, we propose Zero-X, an innovative security framework that effectively detects both 0-day and N-day attacks. The framework achieves this by combining deep neural networks with Open-Set Recognition (OSR). Our approach introduces a novel scheme that uses blockchain technology to facilitate trusted and decentralized federated learning (FL) of the ZeroX framework. This scheme also prioritizes privacy preservation, enabling both CAVs and Security Operation Centers (SOCs) to contribute their unique knowledge while protecting the privacy of their sensitive data. To the best of our knowledge, this is the first work to leverage OSR in combination with privacy-preserving FL to identify both 0-day and N-day attacks in the realm of IoV. The in-depth experiments on two recent network traffic datasets show that the proposed framework achieved a high detection rate while minimizing the false positive rate. Comparison with related work showed that the Zero-X framework outperforms existing solutions.
Abstract:Deploying Connected and Automated Vehicles (CAVs) on top of 5G and Beyond networks (5GB) makes them vulnerable to increasing vectors of security and privacy attacks. In this context, a wide range of advanced machine/deep learning based solutions have been designed to accurately detect security attacks. Specifically, supervised learning techniques have been widely applied to train attack detection models. However, the main limitation of such solutions is their inability to detect attacks different from those seen during the training phase, or new attacks, also called zero-day attacks. Moreover, training the detection model requires significant data collection and labeling, which increases the communication overhead, and raises privacy concerns. To address the aforementioned limits, we propose in this paper a novel detection mechanism that leverages the ability of the deep auto-encoder method to detect attacks relying only on the benign network traffic pattern. Using federated learning, the proposed intrusion detection system can be trained with large and diverse benign network traffic, while preserving the CAVs privacy, and minimizing the communication overhead. The in-depth experiment on a recent network traffic dataset shows that the proposed system achieved a high detection rate while minimizing the false positive rate, and the detection delay.
Abstract:The Advanced Metering Infrastructure (AMI) is one of the key components of the smart grid. It provides interactive services for managing billing and electricity consumption, but it also introduces new vectors for cyberattacks. Although, the devastating and severe impact of power overloading cyberattacks on smart grid AMI, few researches in the literature have addressed them. In the present paper, we propose a two-level anomaly detection framework based on regression decision trees. The introduced detection approach leverages the regularity and predictability of energy consumption to build reference consumption patterns for the whole neighborhood and each household within it. Using a reference consumption pattern enables detecting power overloading cyberattacks regardless of the attacker's strategy as they cause a drastic change in the consumption pattern. The continuous two-level monitoring of energy consumption load allows efficient and early detection of cyberattacks. We carried out an extensive experiment on a real-world publicly available energy consumption dataset of 500 customers in Ireland. We extracted, from the raw data, the relevant attributes for training the energy consumption patterns. The evaluation shows that our approach achieves a high detection rate, a low false alarm rate, and superior performances compared to existing solutions.