Abstract:Network Intrusion Detection Systems (IDS) aim to detect the presence of an intruder by analyzing network packets arriving at an internet connected device. Data-driven deep learning systems, popular due to their superior performance compared to traditional IDS, depend on availability of high quality training data for diverse intrusion classes. A way to overcome this limitation is through transferable learning, where training for one intrusion class can lead to detection of unseen intrusion classes after deployment. In this paper, we provide a detailed study on the transferability of intrusion detection. We investigate practical federated learning configurations to enhance the transferability of intrusion detection. We propose two techniques to significantly improve the transferability of a federated intrusion detection system. The code for this work can be found at https://github.com/ghosh64/transferability.
Abstract:In this paper, we propose a deep-learning-based channel estimation scheme in an orthogonal frequency division multiplexing (OFDM) system. Our proposed method, named Single Slot Recurrence Along Frequency Network (SisRafNet), is based on a novel study of recurrent models for exploiting sequential behavior of channels across frequencies. Utilizing the fact that wireless channels have a high degree of correlation across frequencies, we employ recurrent neural network techniques within a single OFDM slot, thus overcoming the latency and memory constraints typically associated with recurrence based methods. The proposed SisRafNet delivers superior estimation performance compared to existing deep-learning-based channel estimation techniques and the performance has been validated on a wide range of 3rd Generation Partnership Project (3GPP) compliant channel scenarios at multiple signal-to-noise ratios.
Abstract:Deep learning based automatic modulation classification (AMC) has received significant attention owing to its potential applications in both military and civilian use cases. Recently, data-driven subsampling techniques have been utilized to overcome the challenges associated with computational complexity and training time for AMC. Beyond these direct advantages of data-driven subsampling, these methods also have regularizing properties that may improve the adversarial robustness of the modulation classifier. In this paper, we investigate the effects of an adversarial attack on an AMC system that employs deep learning models both for AMC and for subsampling. Our analysis shows that subsampling itself is an effective deterrent to adversarial attacks. We also uncover the most efficient subsampling strategy when an adversarial attack on both the classifier and the subsampler is anticipated.
Abstract:In this paper, we explore transferability in learning between different attack classes in a network intrusion detection setup. We evaluate transferability of attack classes by training a deep learning model with a specific attack class and testing it on a separate attack class. We observe the effects of real and synthetically generated data augmentation techniques on transferability. We investigate the nature of observed transferability relationships, which can be either symmetric or asymmetric. We also examine explainability of the transferability relationships using the recursive feature elimination algorithm. We study data preprocessing techniques to boost model performance. The code for this work can be found at https://github.com/ghosh64/transferability.
Abstract:Accurate disease identification and its severity estimation is an important consideration for disease management. Deep learning-based solutions for disease management using imagery datasets are being increasingly explored by the research community. However, most reported studies have relied on imagery datasets that were acquired under controlled lab conditions. As a result, such models lacked the ability to identify diseases in the field. Therefore, to train a robust deep learning model for field use, an imagery dataset was created using raw images acquired under field conditions using a handheld sensor and augmented images with varying backgrounds. The Corn Disease and Severity (CD&S) dataset consisted of 511, 524, and 562, field acquired raw images, corresponding to three common foliar corn diseases, namely Northern Leaf Blight (NLB), Gray Leaf Spot (GLS), and Northern Leaf Spot (NLS), respectively. For training disease identification models, half of the imagery data for each disease was annotated using bounding boxes and also used to generate 2343 additional images through augmentation using three different backgrounds. For severity estimation, an additional 515 raw images for NLS were acquired and categorized into severity classes ranging from 1 (resistant) to 5 (susceptible). Overall, the CD&S dataset consisted of 4455 total images comprising of 2112 field images and 2343 augmented images.
Abstract:Deep Neural Networks (DNNs) which are trained end-to-end have been successfully applied to solve complex problems that we have not been able to solve in past decades. Autonomous driving is one of the most complex problems which is yet to be completely solved and autonomous racing adds more complexity and exciting challenges to this problem. Towards the challenge of applying end-to-end learning to autonomous racing, this paper shows results on two aspects: (1) Analyzing the relationship between the driving data used for training and the maximum speed at which the DNN can be successfully applied for predicting steering angle, (2) Neural network architecture and training methodology for learning steering and throttle without any feedback or recurrent connections.
Abstract:Gradient-based adversarial attacks on deep neural networks pose a serious threat, since they can be deployed by adding imperceptible perturbations to the test data of any network, and the risk they introduce cannot be assessed through the network's original training performance. Denoising and dimensionality reduction are two distinct methods that have been independently investigated to combat such attacks. While denoising offers the ability to tailor the defense to the specific nature of the attack, dimensionality reduction offers the advantage of potentially removing previously unseen perturbations, along with reducing the training time of the network being defended. We propose strategies to combine the advantages of these two defense mechanisms. First, we propose the cascaded defense, which involves denoising followed by dimensionality reduction. To reduce the training time of the defense for a small trade-off in performance, we propose the hidden layer defense, which involves feeding the output of the encoder of a denoising autoencoder into the network. Further, we discuss how adaptive attacks against these defenses could become significantly weak when an alternative defense is used, or when no defense is used. In this light, we propose a new metric to evaluate a defense which measures the sensitivity of the adaptive attack to modifications in the defense. Finally, we present a guideline for building an ordered repertoire of defenses, a.k.a. a defense infrastructure, that adjusts to limited computational resources in presence of uncertainty about the attack strategy.
Abstract:Automatic modulation classification can be a core component for intelligent spectrally efficient wireless communication networks, and deep learning techniques have recently been shown to deliver superior performance to conventional model-based strategies, particularly when distinguishing between a large number of modulation types. However, such deep learning techniques have also been recently shown to be vulnerable to gradient-based adversarial attacks that rely on subtle input perturbations, which would be particularly feasible in a wireless setting via jamming. One such potent attack is the one known as the Carlini-Wagner attack, which we consider in this work. We further consider a data-driven subsampling setting, where several recently introduced deep-learning-based algorithms are employed to select a subset of samples that lead to reducing the final classifier's training time with minimal loss in accuracy. In this setting, the attacker has to make an assumption about the employed subsampling strategy, in order to calculate the loss gradient. Based on state of the art techniques available to both the attacker and defender, we evaluate best strategies under various assumptions on the knowledge of the other party's strategy. Interestingly, in presence of knowledgeable attackers, we identify computational cost reduction opportunities for the defender with no or minimal loss in performance.
Abstract:In this paper, we propose a framework for predicting frame errors in the collaborative spectrally congested wireless environments of the DARPA Spectrum Collaboration Challenge (SC2) via a recently collected dataset. We employ distributed deep edge learning that is shared among edge nodes and a central cloud. Using this close-to-practice dataset, we find that widely used federated learning approaches, specially those that are privacy preserving, are worse than local training for a wide range of settings. We hence utilize the synthetic minority oversampling technique to maintain privacy via avoiding the transfer of local data to the cloud, and utilize knowledge distillation with an aim to benefit from high cloud computing and storage capabilities. The proposed framework achieves overall better performance than both local and federated training approaches, while being robust against catastrophic failures as well as challenging channel conditions that result in high frame error rates.
Abstract:The Information bottleneck (IB) method enables optimizing over the trade-off between compression of data and prediction accuracy of learned representations, and has successfully and robustly been applied to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize. The IB Lagrangian $\mathcal{L}_{IB}:=I(X;Z)-\beta I(Y;Z)$ is non-convex and existing solutions guarantee only local convergence. As a result, the obtained solutions depend on initialization. Second, the evaluation of a solution is also a challenging task. Conventionally, it resorts to characterizing the information plane, that is, plotting $I(Y;Z)$ versus $I(X;Z)$ for all solutions obtained from different initial points. Furthermore, the IB Lagrangian has phase transitions while varying the multiplier $\beta$. At phase transitions, both $I(X;Z)$ and $I(Y;Z)$ increase abruptly and the rate of convergence becomes significantly slow for existing solutions. Recent works with IB adopt variational surrogate bounds to the IB Lagrangian. Although allowing efficient optimization, how close are these surrogates to the IB Lagrangian is not clear. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Different from prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of $\beta$. Empirically, our gradient-descent-based method results in information plane points that are denser and comparable to those obtained through the conventional Blahut-Arimoto-based solvers.