Abstract:System auditing has emerged as a key approach for monitoring system call events and investigating sophisticated attacks. Based on the collected audit logs, research has proposed to search for attack patterns or track the causal dependencies of system events to reveal the attack sequence. However, existing approaches either cannot reveal long-range attack sequences or suffer from the dependency explosion problem due to a lack of focus on attack-relevant parts, and thus are insufficient for investigating complex attacks. To bridge the gap, we propose Zebra, a system that synergistically integrates attack pattern search and causal dependency tracking for efficient attack investigation. With Zebra, security analysts can alternate between search and tracking to reveal the entire attack sequence in a progressive, user-guided manner, while mitigating the dependency explosion problem by prioritizing the attack-relevant parts. To enable this, Zebra provides (1) an expressive and concise domain-specific language, Tstl, for performing various types of search and tracking analyses, and (2) an optimized language execution engine for efficient execution over a big amount of auditing data. Evaluations on a broad set of attack cases demonstrate the effectiveness of Zebra in facilitating a timely attack investigation.
Abstract:Log-based cyber threat hunting has emerged as an important solution to counter sophisticated cyber attacks. However, existing approaches require non-trivial efforts of manual query construction and have overlooked the rich external knowledge about threat behaviors provided by open-source Cyber Threat Intelligence (OSCTI). To bridge the gap, we build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI. Built upon mature system auditing frameworks, ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors, and (4) an efficient query execution engine to search the big system audit logging data.