Abstract: Robust network security systems are essential to prevent and mitigate the harmful effects of the ever-growing number of network attacks. In recent years, machine learning-based systems have gained popularity for network security applications, usually relying on shallow models that depend on carefully engineered, expert handcrafted input features. The main limitation of this approach is that handcrafted features can fail to perform well across different scenarios and types of attacks. Deep Learning (DL) models can address this limitation through their ability to learn feature representations from raw, unprocessed data. In this paper, we explore the power of DL models for the specific problem of detecting and classifying malware network traffic. As a major advantage with respect to the state of the art, we use raw measurements coming directly from the stream of monitored bytes as input to the proposed models, and evaluate different raw-traffic feature representations, including packet-level and flow-level ones. We introduce DeepMAL, a DL model that captures the underlying statistics of malicious traffic without any expert handcrafted features. Using publicly available traffic traces containing different families of malware traffic, we show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional shallow models.
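
To make the distinction between packet-level and flow-level raw representations concrete, the following is a minimal sketch, not the DeepMAL preprocessing itself. It assumes that each packet is truncated or zero-padded to a fixed number of bytes and scaled to [0, 1], and that a flow is represented by stacking its first few packets. The function names and the parameters `n_bytes` and `n_packets` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def packet_representation(packet: bytes, n_bytes: int) -> List[float]:
    """Truncate or zero-pad one packet to n_bytes and scale each byte to [0, 1].

    Assumption: fixed-length truncation/padding; the abstract does not
    specify how raw bytes are turned into model inputs.
    """
    clipped = packet[:n_bytes]
    padded = clipped + bytes(n_bytes - len(clipped))  # bytes(k) is k zero bytes
    return [b / 255.0 for b in padded]


def flow_representation(
    packets: Sequence[bytes], n_packets: int, n_bytes: int
) -> List[List[float]]:
    """Stack the first n_packets packets of a flow into an n_packets x n_bytes matrix.

    Flows with fewer than n_packets packets are padded with all-zero rows.
    """
    rows = [packet_representation(p, n_bytes) for p in packets[:n_packets]]
    while len(rows) < n_packets:
        rows.append([0.0] * n_bytes)
    return rows


# Toy flow with two short packets (made-up bytes, not real traffic).
flow = [b"\x45\x00\x00\x3c", b"\xff\x10"]
matrix = flow_representation(flow, n_packets=3, n_bytes=4)
```

Under these assumptions, a packet-level model would consume one row at a time, while a flow-level model would consume the whole matrix, so no handcrafted feature is computed at any point.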