Abstract:Network attacks have been very prevalent as their rate is growing tremendously. Both organization and individuals are now concerned about their confidentiality, integrity and availability of their critical information which are often impacted by network attacks. To that end, several previous machine learning-based intrusion detection methods have been developed to secure network infrastructure from such attacks. In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique to tune the parameters of Support Vector Machine with Gaussian Kernel (SVM-RBF), Random Forest (RF), and k-Nearest Neighbor (k-NN) algorithms. The performance of the considered algorithms is evaluated using the ISCX 2012 dataset. Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
Abstract:Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation. The exponential expansion in the deployment of cloud technology has produced a massive amount of data from a variety of applications, resources and platforms. In turn, the rapid rate and volume of data creation has begun to pose significant challenges for data management and security. The design and deployment of intrusion detection systems (IDS) in the big data setting has, therefore, become a topic of importance. In this paper, we conduct a systematic literature review (SLR) of data mining techniques (DMT) used in IDS-based solutions through the period 2013-2018. We employed criterion-based, purposive sampling identifying 32 articles, which constitute the primary source of the present survey. After a careful investigation of these articles, we identified 17 separate DMTs deployed in an IDS context. This paper also presents the merits and disadvantages of the various works of current research that implemented DMTs and distributed streaming frameworks (DSF) to detect and/or prevent malicious attacks in a big data environment.