Wangyan Feng

A Deep Belief Network Based Machine Learning System for Risky Host Detection

Dec 29, 2017

Wangyan Feng, Shuning Wu, Xiaodan Li, Kevin Kunkle

Figures 1-4 for A Deep Belief Network Based Machine Learning System for Risky Host Detection

Abstract: To assure the cyber security of an enterprise, a SIEM (Security Information and Event Management) system is typically in place to normalize security events from different preventive technologies and flag alerts. Analysts in the security operation center (SOC) investigate the alerts to decide whether each one is truly malicious. However, the number of alerts is generally overwhelming: the majority are false positives, and the volume exceeds the SOC's capacity to handle them all. There is therefore a great need to reduce the false positive rate as much as possible. While most previous research has focused on network intrusion detection, we focus on risk detection and propose an intelligent Deep Belief Network machine learning system. The system leverages alert information, various security logs, and analysts' investigation results in a real enterprise environment to flag hosts that have a high likelihood of being compromised. Text mining and graph-based methods are used to generate targets and create features for machine learning. In the experiment, the Deep Belief Network is compared with other machine learning algorithms, including a multi-layer neural network, random forest, support vector machine, and logistic regression. Results on real enterprise data indicate that the Deep Belief Network machine learning system performs better than the other algorithms for our problem and is six times more effective than the current rule-based system. We also implement the whole system, from data collection, label creation, and feature engineering to host score generation, in a real enterprise production environment.
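The abstract does not say how "six times more effective" is measured. As a rough illustration only, one common way to compare a host-scoring model against a rule-based baseline is to rank hosts by score and compare the fraction of confirmed risky hosts among the top k. The sketch below assumes that metric; the function names `precision_at_k` and `lift`, the host names, and all scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def precision_at_k(scores: Dict[str, float], risky: Set[str], k: int) -> float:
    """Fraction of the k highest-scored hosts that are in the confirmed-risky set."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not scores:
        raise ValueError("scores must not be empty")
    # Rank hosts by descending score; break ties by host name for determinism.
    ranked = sorted(scores, key=lambda host: (-scores[host], host))
    top = ranked[:k]
    return sum(1 for host in top if host in risky) / len(top)


def lift(model_precision: float, baseline_precision: float) -> float:
    """Ratio of model precision to baseline precision (infinite if baseline is 0)."""
    if baseline_precision == 0:
        return float("inf")
    return model_precision / baseline_precision


# Toy data, made up purely for illustration (not from the paper).
model_scores = {
    "host-a": 0.91, "host-b": 0.15, "host-c": 0.78,
    "host-d": 0.40, "host-e": 0.05, "host-f": 0.66,
}
rule_scores = {
    "host-a": 0.20, "host-b": 0.90, "host-c": 0.85,
    "host-d": 0.80, "host-e": 0.70, "host-f": 0.60,
}
# Hosts that analysts confirmed as risky after investigation (hypothetical labels).
confirmed_risky = {"host-a", "host-c", "host-f"}

model_p = precision_at_k(model_scores, confirmed_risky, k=3)
rule_p = precision_at_k(rule_scores, confirmed_risky, k=3)
print(f"model precision@3: {model_p:.2f}")
print(f"rule  precision@3: {rule_p:.2f}")
print(f"lift over rules:   {lift(model_p, rule_p):.2f}")
```

Choosing k to match the number of hosts the SOC can actually investigate per day would tie the metric to analyst capacity, which is the constraint the abstract highlights.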

* 10 pages, 10 figures. The paper was accepted by the IEEE Conference on Communications and Network Security 2017. However, it was not published because none of the authors attended the conference.
