Abstract:The flexibility and complexity of IPv6 extension headers allow attackers to create covert channels or bypass security mechanisms, leading to potential data breaches or system compromises. The mature development of machine learning has become the primary detection technology option used to mitigate covert communication threats. However, the complexity of detecting covert communication, evolving injection techniques, and scarcity of data make building machine-learning models challenging. In previous related research, machine learning has shown good performance in detecting covert communications, but oversimplified attack scenario assumptions cannot represent the complexity of modern covert technologies and make it easier for machine learning models to detect covert communications. To bridge this gap, in this study, we analyzed the packet structure and network traffic behavior of IPv6, used encryption algorithms, and performed covert communication injection without changing network packet behavior to get closer to real attack scenarios. In addition to analyzing and injecting methods for covert communications, this study also uses comprehensive machine learning techniques to train the model proposed in this study to detect threats, including traditional decision trees such as random forests and gradient boosting, as well as complex neural network architectures such as CNNs and LSTMs, to achieve detection accuracy of over 90\%. This study details the methods used for dataset augmentation and the comparative performance of the applied models, reinforcing insights into the adaptability and resilience of the machine learning application in IPv6 covert communication. In addition, we also proposed a Generative AI-assisted interpretation concept based on prompt engineering as a preliminary study of the role of Generative AI agents in covert communication.
Abstract:Deploying machine learning (ML) in dynamic data-driven applications systems (DDDAS) can improve the security of industrial control systems (ICS). However, ML-based DDDAS are vulnerable to adversarial attacks because adversaries can alter the input data slightly so that the ML models predict a different result. In this paper, our goal is to build a resilient edge machine learning (reML) architecture that is designed to withstand adversarial attacks by performing Data Air Gap Transformation (DAGT) to anonymize data feature spaces using deep neural networks and randomize the ML models used for predictions. The reML is based on the Resilient DDDAS paradigm, Moving Target Defense (MTD) theory, and TinyML and is applied to combat adversarial attacks on ICS. Furthermore, the proposed approach is power-efficient and privacy-preserving and, therefore, can be deployed on power-constrained devices to enhance ICS security. This approach enables resilient ML inference at the edge by shifting the computation from the computing-intensive platforms to the resource-constrained edge devices. The incorporation of TinyML with TensorFlow Lite ensures efficient resource utilization and, consequently, makes reML suitable for deployment in various industrial control environments. Furthermore, the dynamic nature of reML, facilitated by the resilient DDDAS development environment, allows for continuous adaptation and improvement in response to emerging threats. Lastly, we evaluate our approach on an ICS dataset and demonstrate that reML provides a viable and effective solution for resilient ML inference at the edge devices.
Abstract:Autonomous Vehicles (AVs) heavily rely on sensors and communication networks like Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, thus posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AVs, making it crucial to detect and mitigate such attacks. This paper proposes GPS Intrusion Detection System, or GPS-IDS, an Anomaly Behavior Analysis (ABA)-based intrusion detection framework to detect GPS spoofing attacks on AVs. The framework uses a novel physics-based vehicle behavior model where a GPS navigation model is integrated into the conventional dynamic bicycle model for accurate AV behavior representation. Temporal features derived from this behavior model are analyzed using machine learning to detect normal and abnormal navigation behavior. The performance of the GPS-IDS framework is evaluated on the AV-GPS-Dataset - a real-world dataset collected by the team using an AV testbed. The dataset has been publicly released for the global research community. To the best of our knowledge, this dataset is the first of its kind and will serve as a useful resource to address such security challenges.
Abstract:As a rapidly growing cyber-physical platform, Autonomous Vehicles (AVs) are encountering more security challenges as their capabilities continue to expand. In recent years, adversaries are actively targeting the perception sensors of autonomous vehicles with sophisticated attacks that are not easily detected by the vehicles' control systems. This work proposes an Anomaly Behavior Analysis approach to detect a perception sensor attack against an autonomous vehicle. The framework relies on temporal features extracted from a physics-based autonomous vehicle behavior model to capture the normal behavior of vehicular perception in autonomous driving. By employing a combination of model-based techniques and machine learning algorithms, the proposed framework distinguishes between normal and abnormal vehicular perception behavior. To demonstrate the application of the framework in practice, we performed a depth camera attack experiment on an autonomous vehicle testbed and generated an extensive dataset. We validated the effectiveness of the proposed framework using this real-world data and released the dataset for public access. To our knowledge, this dataset is the first of its kind and will serve as a valuable resource for the research community in evaluating their intrusion detection techniques effectively.
Abstract:Large-scale transformer-based models like the Bidirectional Encoder Representations from Transformers (BERT) are widely used for Natural Language Processing (NLP) applications, wherein these models are initially pre-trained with a large corpus with millions of parameters and then fine-tuned for a downstream NLP task. One of the major limitations of these large-scale models is that they cannot be deployed on resource-constrained devices due to their large model size and increased inference latency. In order to overcome these limitations, such large-scale models can be converted to an optimized FlatBuffer format, tailored for deployment on resource-constrained edge devices. Herein, we evaluate the performance of such FlatBuffer transformed MobileBERT models on three different edge devices, fine-tuned for Reputation analysis of English language tweets in the RepLab 2013 dataset. In addition, this study encompassed an evaluation of the deployed models, wherein their latency, performance, and resource efficiency were meticulously assessed. Our experiment results show that, compared to the original BERT large model, the converted and quantized MobileBERT models have 160$\times$ smaller footprints for a 4.1% drop in accuracy while analyzing at least one tweet per second on edge devices. Furthermore, our study highlights the privacy-preserving aspect of TinyML systems as all data is processed locally within a serverless environment.
Abstract:Social media has become an essential part of the modern lifestyle, with its usage being highly prevalent. This has resulted in unprecedented amounts of data generated from users in social media, such as users' attitudes, opinions, interests, purchases, and activities across various aspects of their lives. Therefore, in a world of social media, where its power has shifted to users, actions taken by companies and public figures are subject to constantly being under scrutiny by influential global audiences. As a result, reputation management in social media has become essential as companies and public figures need to maintain their reputation to preserve their reputation capital. However, domain experts still face the challenge of lacking appropriate solutions to automate reliable online reputation analysis. To tackle this challenge, we proposed a novel reputation analysis approach based on the popular language model BERT (Bidirectional Encoder Representations from Transformers). The proposed approach was evaluated on the reputational polarity task using RepLab 2013 dataset. Compared to previous works, we achieved 5.8% improvement in accuracy, 26.9% improvement in balanced accuracy, and 21.8% improvement in terms of F-score.
Abstract:Anomaly detection is critically important for intelligent surveillance systems to detect in a timely manner any malicious activities. Many video anomaly detection approaches using deep learning methods focus on a single camera video stream with a fixed scenario. These deep learning methods use large-scale training data with large complexity. As a solution, in this paper, we show how to use pre-trained convolutional neural net models to perform feature extraction and context mining, and then use denoising autoencoder with relatively low model complexity to provide efficient and accurate surveillance anomaly detection, which can be useful for the resource-constrained devices such as edge devices of the Internet of Things (IoT). Our anomaly detection model makes decisions based on the high-level features derived from the selected embedded computer vision models such as object classification and object detection. Additionally, we derive contextual properties from the high-level features to further improve the performance of our video anomaly detection method. We use two UCSD datasets to demonstrate that our approach with relatively low model complexity can achieve comparable performance compared to the state-of-the-art approaches.