Abstract: We leverage a streaming architecture based on ELK, Spark, and Hadoop to collect, store, and analyse database connection logs in near real-time. The proposed system investigates outliers using unsupervised learning, applying widely adopted clustering and classification algorithms to the log data and highlighting the subtle variances between models through visualisation of the outliers. Arriving at a novel approach to evaluating untagged, unfiltered connection logs, we show how the method can be extrapolated to a generalised system for analysing connection logs across a large infrastructure comprising thousands of individual nodes and generating hundreds of log lines per second.
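As a concrete illustration of the clustering-based outlier detection described above, the following is a minimal sketch, not the paper's exact pipeline: the feature names, toy data, and DBSCAN parameters are assumptions chosen for illustration. Density-based clustering such as DBSCAN labels points that belong to no cluster as noise, which can be read directly as outlier candidates among connection-log feature vectors.

```python
# Minimal sketch (illustrative only): flag unusual connection-log feature vectors
# by clustering them with DBSCAN and treating the noise label (-1) as an outlier.
# The features and parameters below are hypothetical, not taken from the paper.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Hypothetical per-connection features: [connections_per_min, bytes_out, failed_auths]
X = np.array([
    [12, 4.2e4, 0],
    [15, 3.9e4, 1],
    [11, 4.0e4, 0],
    [14, 4.1e4, 0],
    [210, 9.8e6, 37],   # an obviously unusual connection pattern
])

# Scale features so no single dimension dominates the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Points with too few neighbours within eps receive the noise label -1.
labels = DBSCAN(eps=1.0, min_samples=2).fit_predict(X_scaled)
outliers = np.where(labels == -1)[0]
print("Outlier indices:", outliers)   # the last row is flagged
```

A density-based method is shown here only because it exposes outliers directly; the same feature vectors could equally be passed to the other clustering or classification models the system compares.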
Abstract: We propose an architecture for analysing database connection logs across different database instances within an intranet comprising over 10,000 users and associated devices. Our system uses Flume agents to forward log notifications to a Hadoop Distributed File System for long-term storage and to Elasticsearch and Kibana for short-term visualisation, effectively creating a data lake from which log data can be extracted. We adopt an ensemble of machine learning models to filter and process the indicators within the data, aiming to predict anomalies or outliers using feature vectors built from the log data.
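To make the notion of feature vectors built from log data concrete, the sketch below shows one way raw connection-log lines could be parsed and aggregated into per-user indicators. The log format, field names, and chosen features are assumptions for illustration; the real system would parse whatever fields its database connection logs expose.

```python
# Minimal sketch (hypothetical log format): turn raw connection-log lines into
# per-user feature vectors such as [total connections, failed attempts, distinct hosts].
from collections import defaultdict
from datetime import datetime

sample_lines = [
    "2018-03-01T10:00:01 user=alice host=db01 status=SUCCESS",
    "2018-03-01T10:00:03 user=alice host=db01 status=SUCCESS",
    "2018-03-01T10:00:05 user=bob host=db02 status=FAILURE",
    "2018-03-01T10:00:07 user=bob host=db03 status=FAILURE",
]

def parse(line):
    """Split a log line into a timestamp and key=value fields."""
    ts, *fields = line.split()
    record = dict(f.split("=", 1) for f in fields)
    record["timestamp"] = datetime.fromisoformat(ts)
    return record

# Aggregate simple per-user indicators from the parsed records.
stats = defaultdict(lambda: {"connections": 0, "failures": 0, "hosts": set()})
for line in sample_lines:
    rec = parse(line)
    s = stats[rec["user"]]
    s["connections"] += 1
    s["failures"] += rec["status"] == "FAILURE"
    s["hosts"].add(rec["host"])

# Feature vectors of the kind that could be fed to the anomaly-detection models.
features = {
    user: [s["connections"], s["failures"], len(s["hosts"])]
    for user, s in stats.items()
}
print(features)   # e.g. {'alice': [2, 0, 1], 'bob': [2, 2, 2]}
```

In the proposed architecture this aggregation would happen over data drawn from the HDFS/Elasticsearch data lake rather than an in-memory list, but the shape of the resulting feature vectors is the same.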