Abstract: In this paper, we address the knowledge engineering problems of hypothesis generation, motivated by applications that require timely exploration of hypotheses under unreliable observations. We look at two applications: malware detection and intensive care delivery. In intensive care, the goal is to generate plausible hypotheses about the condition of the patient from clinical observations and to further refine these hypotheses to create a recovery plan for the patient. Similarly, preventing malware spread within a corporate network involves generating hypotheses from network traffic data and selecting preventive actions. To this end, building on the established characterization and use of AI planning for similar problems, we propose the use of planning for the hypothesis generation problem. However, to deal with uncertainty, incomplete model descriptions, and unreliable observations, we need a planner capable of generating multiple high-quality plans. To capture the model description, we propose a language called LTS++ and a web-based tool that enables the specification of the LTS++ model and a set of observations. We also propose a 9-step process that guides the domain expert in specifying the LTS++ model. The hypotheses are then generated by running a planner on the translated LTS++ model and the provided trace. The hypotheses can be visualized and shown to the analyst or investigated further automatically.
Abstract: As network traffic monitoring software for cybersecurity, malware detection, and other critical tasks becomes increasingly automated, both the rate of alerts and supporting data gathered and the complexity of the underlying model regularly exceed human processing capabilities. Many of these applications require complex models and constituent rules to reach decisions that influence the operation of entire systems. In this paper, we motivate the novel "strategic planning" problem: that of gathering data from the world and applying the underlying model of the domain to reach decisions that monitor the system in an automated manner. We describe how we apply automated planning methods to this problem, including the technique we used to solve it at a scale that meets the demands of a real-time, real-world scenario. We then present a PDDL model of one such application scenario related to network administration and monitoring, followed by a description of a novel integrated system that was built to accept generated plans and to continue the execution process. Finally, we present evaluations of two different automated planners and their different capabilities with our integrated system, both on a six-month window of network data and using a simulator.