Simon Mandlik

Malicious Internet Entity Detection Using Local Graph Inference

Aug 07, 2024

Simon Mandlik, Tomas Pevny, Vaclav Smidl, Lukas Bajer

Abstract: Detection of malicious behavior in a large network is a challenging problem for machine learning in computer security, since it requires a model with both high expressive power and scalable inference. Existing solutions struggle to achieve both: current approaches tailored to cybersecurity are still limited in expressivity, and methods successful in other domains do not scale well to large volumes of data, rendering frequent retraining impossible. This work proposes a new perspective for learning from graph data that models network entity interactions as a large heterogeneous graph. High expressivity is achieved with the neural network architecture HMILnet, which naturally models this type of data and provides theoretical guarantees. Scalability is achieved by pursuing local graph inference, i.e., classifying individual vertices and their neighborhoods as independent samples. Our experiments show an improvement over the state-of-the-art Probabilistic Threat Propagation (PTP) algorithm, a further threefold accuracy improvement when additional data is used, which is not possible with the PTP algorithm, and generalization of the method to new, previously unseen entities.

* A preprint. Full publication: https://ieeexplore.ieee.org/document/10418120
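
The idea of local graph inference mentioned in the abstract, treating a vertex together with its neighborhood as one independent sample, can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not involve HMILnet: the toy graph, the vertex naming scheme, the `local_neighborhood` function, and the `max_hops` parameter are all assumptions made for illustration. It only shows how a bounded neighborhood might be cut out of a larger graph so that each sample has a size independent of the whole network.

```python
from collections import deque


def local_neighborhood(adj, root, max_hops=2):
    """Return {vertex: hop distance} for every vertex within max_hops of root.

    `adj` is a plain adjacency dict (hypothetical format, not from the paper).
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if dist[v] == max_hops:
            continue  # do not expand beyond the hop limit
        for u in adj.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


# Hypothetical heterogeneous toy graph: domains, IP addresses, and clients.
adj = {
    "domain:a.example": ["ip:10.0.0.1", "client:c1"],
    "ip:10.0.0.1": ["domain:a.example", "domain:b.example"],
    "client:c1": ["domain:a.example"],
    "domain:b.example": ["ip:10.0.0.1", "client:c2"],
    "client:c2": ["domain:b.example"],
}

# One sample = the vertex to classify plus its 2-hop neighborhood.
sample = local_neighborhood(adj, "domain:a.example", max_hops=2)
```

In this sketch, `client:c2` lies three hops from the root and is therefore left out of the sample; a classifier would then see only the extracted subgraph rather than the full graph.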


Mill.jl and JsonGrinder.jl: automated differentiable feature extraction for learning from raw JSON data

May 19, 2021

Simon Mandlik, Matej Racinsky, Viliam Lisy, Tomas Pevny

Abstract: Learning from raw input data, and thus limiting the need for manual feature engineering, is one of the key components of many successful applications of machine learning methods. While machine learning problems are often formulated on data that naturally translate into a vector representation suitable for classifiers, there are data sources, for example in cybersecurity, that are naturally represented in diverse files with a unifying hierarchical structure, such as XML, JSON, and Protocol Buffers. Converting this data to a vector (tensor) representation is generally done by manual feature engineering, which is laborious, lossy, and prone to human bias about the importance of particular features. Mill and JsonGrinder are a tandem of libraries that fully automates the conversion. Starting with an arbitrary set of JSON samples, they create a differentiable machine learning model capable of inference on further JSON samples in their raw form.

* 5 pages, 2 figures, 1 table, submitted to the section on open-source software of the Journal of Machine Learning Research
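
To make the conversion problem in the abstract concrete, the following Python sketch maps nested JSON of varying shape to a fixed-length vector by recursing over the hierarchy and averaging over arrays and object fields. It is not the Mill.jl or JsonGrinder.jl API, and it has no trainable parameters, so it is not differentiable in the sense the abstract describes; the `embed` function, the `dim` parameter, and the character-bucket string encoding are assumptions chosen only to keep the example small.

```python
import json


def embed(node, dim=4):
    """Map a parsed JSON value to a list of `dim` floats (illustrative only)."""
    if isinstance(node, (bool, int, float)):
        # Scalars occupy the first coordinate.
        return [float(node)] + [0.0] * (dim - 1)
    if isinstance(node, str):
        # Crude string encoding: count characters per bucket.
        vec = [0.0] * dim
        for ch in node:
            vec[ord(ch) % dim] += 1.0
        return vec
    if isinstance(node, list):
        # Arrays are treated as bags: average the element embeddings.
        if not node:
            return [0.0] * dim
        parts = [embed(item, dim) for item in node]
        return [sum(col) / len(parts) for col in zip(*parts)]
    if isinstance(node, dict):
        # Objects: average over fields, adding a key embedding to each value.
        if not node:
            return [0.0] * dim
        parts = [
            [k + v for k, v in zip(embed(key, dim), embed(value, dim))]
            for key, value in sorted(node.items())
        ]
        return [sum(col) / len(parts) for col in zip(*parts)]
    return [0.0] * dim  # JSON null


# Two hypothetical samples with different fields and array lengths.
a = embed(json.loads('{"ports": [80, 443], "host": "a.example"}'))
b = embed(json.loads('{"ports": [22], "host": "b.example", "tags": ["x", "y"]}'))
```

Both `a` and `b` have length `dim` despite the differing structure of their inputs, which is the property a downstream classifier needs. In a learned model, the fixed averaging steps would be replaced by parameterized layers.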
