Abstract:Cyber threat hunting is the practice of proactively searching for latent threats in a network. Engaging in threat hunting can be difficult due to the volume of network traffic, variety of adversary techniques, and constantly evolving vulnerabilities. To aid analysts in identifying techniques which may be co-occurring as part of a campaign, we present the Technique Inference Engine, a tool to infer tactics, techniques, and procedures (TTPs) which may be related to existing observations of adversarial behavior. We compile the largest (to our knowledge) available dataset of cyber threat intelligence (CTI) reports labeled with relevant TTPs. With the knowledge that techniques are chronically under-reported in CTI, we apply several implicit feedback recommender models to the data in order to predict additional techniques which may be part of a given campaign. We evaluate the results in the context of the cyber analyst's use case and apply t-SNE to visualize the model embeddings. We provide our code and a web interface.
Abstract:Modern pre-trained architectures struggle to retain previous information while undergoing continuous fine-tuning on new tasks. Despite notable progress in continual classification, systems designed for complex vision tasks such as detection or segmentation still struggle to attain satisfactory performance. In this work, we introduce a memory-based detection transformer architecture to adapt a pre-trained DETR-style detector to new tasks while preserving knowledge from previous tasks. We propose a novel localized query function for efficient information retrieval from memory units, aiming to minimize forgetting. Furthermore, we identify a fundamental challenge in continual detection referred to as background relegation. This arises when object categories from earlier tasks reappear in future tasks, potentially without labels, leading them to be implicitly treated as background. This is an inevitable issue in continual detection or segmentation. The introduced continual optimization technique effectively tackles this challenge. Finally, we assess the performance of our proposed system on continual detection benchmarks and demonstrate that our approach surpasses the performance of existing state-of-the-art resulting in 5-7% improvements on MS-COCO and PASCAL-VOC on the task of continual detection.