Abstract:Anomaly-based cyber threat detection using deep learning is on a constant growth in popularity for novel cyber-attack detection and forensics. A robust, efficient, and real-time threat detector in a large-scale operational enterprise network requires high accuracy, high fidelity, and a high throughput model to detect malicious activities. Traditional anomaly-based detection models, however, suffer from high computational overhead and low detection accuracy, making them unsuitable for real-time threat detection. In this work, we propose LogSHIELD, a highly effective graph-based anomaly detection model in host data. We present a real-time threat detection approach using frequency-domain analysis of provenance graphs. To demonstrate the significance of graph-based frequency analysis we proposed two approaches. Approach-I uses a Graph Neural Network (GNN) LogGNN and approach-II performs frequency domain analysis on graph node samples for graph embedding. Both approaches use a statistical clustering algorithm for anomaly detection. The proposed models are evaluated using a large host log dataset consisting of 774M benign logs and 375K malware logs. LogSHIELD explores the provenance graph to extract contextual and causal relationships among logs, exposing abnormal activities. It can detect stealthy and sophisticated attacks with over 98% average AUC and F1 scores. It significantly improves throughput, achieves an average detection latency of 0.13 seconds, and outperforms state-of-the-art models in detection time.