Abstract:A key challenge associated with Kubernetes configuration files (KCFs) is that they are often highly complex and error-prone, leading to security vulnerabilities and operational setbacks. Rule-based (RB) tools for KCF misconfiguration detection rely on static rule sets, making them inherently limited and unable to detect newly-discovered misconfigurations. RB tools also suffer from misdetection, since mistakes are likely when coding the detection rules. Recent methods for detecting and remediating KCF misconfigurations are limited in terms of their scalability and detection coverage, or due to the fact that they have high expertise requirements and do not offer automated remediation along with misconfiguration detection. Novel approaches that employ LLMs in their pipeline rely on API-based, general-purpose, and mainly commercial models. Thus, they pose security challenges, have inconsistent classification performance, and can be costly. In this paper, we propose GenKubeSec, a comprehensive and adaptive, LLM-based method, which, in addition to detecting a wide variety of KCF misconfigurations, also identifies the exact location of the misconfigurations and provides detailed reasoning about them, along with suggested remediation. When empirically compared with three industry-standard RB tools, GenKubeSec achieved equivalent precision (0.990) and superior recall (0.999). When a random sample of KCFs was examined by a Kubernetes security expert, GenKubeSec's explanations as to misconfiguration localization, reasoning and remediation were 100% correct, informative and useful. To facilitate further advancements in this domain, we share the unique dataset we collected, a unified misconfiguration index we developed for label standardization, our experimentation code, and GenKubeSec itself as an open-source tool.
Abstract:Although home IoT (Internet of Things) devices are typically plain and task oriented, the context of their daily use may affect their traffic patterns. For this reason, anomaly-based intrusion detection systems tend to suffer from a high false positive rate (FPR). To overcome this, we propose a two-step collaborative anomaly detection method which first uses an autoencoder to differentiate frequent (`benign') and infrequent (possibly `malicious') traffic flows. Clustering is then used to analyze only the infrequent flows and classify them as either known ('rare yet benign') or unknown (`malicious'). Our method is collaborative, in that (1) normal behaviors are characterized more robustly, as they take into account a variety of user interactions and network topologies, and (2) several features are computed based on a pool of identical devices rather than just the inspected device. We evaluated our method empirically, using 21 days of real-world traffic data that emanated from eight identical IoT devices deployed on various networks, one of which was located in our controlled lab where we implemented two popular IoT-related cyber-attacks. Our collaborative anomaly detection method achieved a macro-average area under the precision-recall curve of 0.841, an F1 score of 0.929, and an FPR of only 0.014. These promising results were obtained by using labeled traffic data from our lab as the test set, while training the models on the traffic of devices deployed outside the lab, and thus demonstrate a high level of generalizability. In addition to its high generalizability and promising performance, our proposed method also offers benefits such as privacy preservation, resource savings, and model poisoning mitigation. On top of that, as a contribution to the scientific community, our novel dataset is available online.
Abstract:Today, telecommunication service providers (telcos) are exposed to cyber-attacks executed by compromised IoT devices connected to their customers' networks. Such attacks might have severe effects not only on the target of attacks but also on the telcos themselves. To mitigate those risks we propose a machine learning based method that can detect devices of specific vulnerable IoT models connected behind a domestic NAT, thereby identifying home networks that pose a risk to the telco's infrastructure and availability of services. As part of the effort to preserve the domestic customers' privacy, our method relies on NetFlow data solely, refraining from inspecting the payload. To promote future research in this domain we share our novel dataset, collected in our lab from numerous and various commercial IoT devices.
Abstract:The proliferation of IoT devices which can be more easily compromised than desktop computers has led to an increase in the occurrence of IoT based botnet attacks. In order to mitigate this new threat there is a need to develop new methods for detecting attacks launched from compromised IoT devices and differentiate between hour and millisecond long IoTbased attacks. In this paper we propose and empirically evaluate a novel network based anomaly detection method which extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic emanating from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two of the most widely known IoT based botnets, Mirai and BASHLITE. Our evaluation results demonstrated our proposed method's ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices which were part of a botnet.
Abstract:Security experts have demonstrated numerous risks imposed by Internet of Things (IoT) devices on organizations. Due to the widespread adoption of such devices, their diversity, standardization obstacles, and inherent mobility, organizations require an intelligent mechanism capable of automatically detecting suspicious IoT devices connected to their networks. In particular, devices not included in a white list of trustworthy IoT device types (allowed to be used within the organizational premises) should be detected. In this research, Random Forest, a supervised machine learning algorithm, was applied to features extracted from network traffic data with the aim of accurately identifying IoT device types from the white list. To train and evaluate multi-class classifiers, we collected and manually labeled network traffic data from 17 distinct IoT devices, representing nine types of IoT devices. Based on the classification of 20 consecutive sessions and the use of majority rule, IoT device types that are not on the white list were correctly detected as unknown in 96% of test cases (on average), and white listed device types were correctly classified by their actual types in 99% of cases. Some IoT device types were identified quicker than others (e.g., sockets and thermostats were successfully detected within five TCP sessions of connecting to the network). Perfect detection of unauthorized IoT device types was achieved upon analyzing 110 consecutive sessions; perfect classification of white listed types required 346 consecutive sessions, 110 of which resulted in 99.49% accuracy. Further experiments demonstrated the successful applicability of classifiers trained in one location and tested on another. In addition, a discussion is provided regarding the resilience of our machine learning-based IoT white listing method to adversarial attacks.