Michael Dymshits

Sampling High Throughput Data for Anomaly Detection of Data-Base Activity

Aug 14, 2017

Hagit Grushka-Cohen, Oded Sofer, Ofer Biller, Michael Dymshits, Lior Rokach, Bracha Shapira

Abstract: Data leakage and theft from databases are a dangerous threat to organizations. Data Security and Data Privacy protection systems (DSDP) monitor data access and usage to identify leakage or suspicious activities that should be investigated. Because of the high-velocity nature of database systems, such systems audit only a portion of the vast number of transactions that take place. Anomalies are investigated by a Security Officer (SO) in order to choose the proper response. In this paper we investigate the effect of sampling methods based on the risk each transaction poses and propose a new method of "combined sampling" for capturing a more varied sample.

* Proceedings of the 11th Pre-ICIS Workshop on Information Security and Privacy, Dublin, Ireland, December 10, 2016
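
The abstract above does not spell out how "combined sampling" works, so the following is only a minimal sketch of one plausible reading: part of the audit budget is drawn with probability proportional to a per-transaction risk score, and the rest is drawn uniformly so that low-risk activity still appears in the sample. The function name `combined_sample`, the `risk` field, and the `risk_share` parameter are all hypothetical and are not taken from the paper.

```python
import random


def combined_sample(transactions, budget, risk_share=0.5, seed=None):
    """Pick up to `budget` transactions, mixing risk-weighted and uniform draws.

    Assumption: each transaction is a dict with a non-negative "risk" score.
    This is an illustrative mix, not the authors' published method.
    """
    rng = random.Random(seed)
    budget = min(budget, len(transactions))
    n_risk = int(budget * risk_share)

    pool = list(transactions)
    chosen = []

    # Risk-weighted draws without replacement: higher risk, higher chance.
    for _ in range(n_risk):
        weights = [t["risk"] for t in pool]
        if sum(weights) <= 0:
            break  # nothing risky left; fall through to uniform draws
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx))

    # Fill the remaining budget uniformly from what is left.
    chosen.extend(rng.sample(pool, budget - len(chosen)))
    return chosen


if __name__ == "__main__":
    txs = [{"id": i, "risk": i % 5} for i in range(100)]
    sample = combined_sample(txs, budget=10, risk_share=0.5, seed=0)
    print(sorted(t["id"] for t in sample))
```

Setting `risk_share=1.0` gives purely risk-based sampling and `risk_share=0.0` gives purely random sampling, which makes it easy to compare the two extremes against a mix.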

Process Monitoring on Sequences of System Call Count Vectors

Jul 12, 2017

Michael Dymshits, Ben Myara, David Tolpin

Abstract: We introduce a methodology for efficient monitoring of processes running on hosts in a corporate network. The methodology is based on collecting streams of system calls produced by all or selected processes on the hosts, and sending them over the network to a monitoring server, where machine learning algorithms are used to identify changes in process behavior due to malicious activity, hardware failures, or software errors. The methodology uses a sequence of system call count vectors as the data format, which can handle large and varying volumes of data. Unlike previous approaches, the methodology introduced in this paper is suitable for distributed collection and processing of data in large corporate networks. We evaluate the methodology both in a laboratory setting and on a real-life setup, and provide statistics characterizing its performance and accuracy.

* 5 pages, 4 figures, ICCST 2017
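
To make the "sequence of system call count vectors" format from the abstract above concrete, here is a small sketch that turns a stream of system call names into one count vector per window. The windowing by a fixed number of calls, the extra "other" slot for calls outside the vocabulary, and the function name `count_vectors` are assumptions made for illustration; the abstract does not specify these details.

```python
from collections import Counter


def count_vectors(syscalls, vocabulary, window):
    """Convert a stream of syscall names into per-window count vectors.

    Each vector has one slot per name in `vocabulary`, plus a final slot
    counting calls that are not in the vocabulary (an assumption here).
    """
    index = {name: i for i, name in enumerate(vocabulary)}
    other = len(vocabulary)  # position of the catch-all slot

    vectors = []
    for start in range(0, len(syscalls), window):
        counts = Counter(syscalls[start:start + window])
        vec = [0] * (len(vocabulary) + 1)
        for name, n in counts.items():
            vec[index.get(name, other)] += n
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    stream = ["open", "read", "read", "close", "open", "write", "mmap"]
    vocab = ["open", "read", "write", "close"]
    print(count_vectors(stream, vocab, window=4))
    # prints [[1, 2, 0, 1, 0], [1, 0, 1, 0, 1]]
```

Each vector has a fixed length regardless of how many calls fall into the window, which is what lets the representation cope with large and varying data volumes before the vectors are passed to a learning model.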
