Abstract: Malicious cyber activity is ubiquitous, and its effects on society are dramatic and often irreversible. Given the shortage of cybersecurity professionals, the ever-evolving adversary, the massive amounts of data that could contain evidence of an attack, and the speed at which defensive actions must be taken, innovations that enable autonomy in cybersecurity must continue to expand in order to move away from a reactive defense posture and toward a proactive one. The challenges in this space are quite different from those of applying AI in other domains such as computer vision. The environment suffers from an extremely high degree of uncertainty, stemming from the intractability of ingesting all the available data as well as the possibility that malicious actors are manipulating that data. Another challenge unique to this space is that the dynamism of the adversary causes indicators of compromise to change frequently and without warning. In spite of these challenges, machine learning has been applied to this domain and has achieved some success in the realm of detection. While this aspect of the problem is far from solved, a growing part of the commercial sector provides ML-enhanced capabilities as a service, and many of these entities also provide platforms that facilitate the deployment of these automated solutions. Academic research in this space is growing and continues to influence current solutions, as well as to strengthen the foundational knowledge that will make autonomous agents in this space a possibility.
Abstract: Intrusion Detection Systems (IDS) enhanced with Machine Learning (ML) have demonstrated the capacity to efficiently build a profile of "normal" cyber behavior and thereby detect the activity of cyber threats with greater accuracy than traditional rule-based IDS. Because these models are largely black boxes, their acceptance requires proof of robustness to stealthy adversaries. Since it is impossible (outside of controlled experiments) to build a baseline from activity that is completely free of malicious cyber actors, the training data for deployed models will be poisoned with examples of the very activity analysts would want to be alerted to. We train an autoencoder-based anomaly detection system on network activity containing various proportions of malicious activity and demonstrate that it is robust to this sort of poisoning.
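The following is a minimal sketch, not the authors' implementation, of the experimental setup the second abstract describes: an autoencoder is trained on a baseline into which a chosen fraction of malicious examples has been mixed, and a point is flagged as anomalous when its reconstruction error exceeds a threshold fit on the training data. Synthetic Gaussian features stand in for real network-flow features, and the names (make_data, poison_frac, the 99th-percentile threshold, the network shape) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)

def make_data(n, malicious=False, dim=20):
    # Stand-in features: benign traffic is a tight cluster,
    # malicious traffic a shifted, noisier one. (Assumption, not the
    # paper's data; the paper uses real network activity.)
    center = 3.0 if malicious else 0.0
    scale = 2.0 if malicious else 1.0
    return rng.normal(center, scale, size=(n, dim)).astype(np.float32)

def train_autoencoder(x_train, epochs=200, lr=1e-3):
    dim = x_train.shape[1]
    model = nn.Sequential(
        nn.Linear(dim, 8), nn.ReLU(),
        nn.Linear(8, 3), nn.ReLU(),      # bottleneck
        nn.Linear(3, 8), nn.ReLU(),
        nn.Linear(8, dim),
    )
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    x = torch.from_numpy(x_train)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), x)
        loss.backward()
        opt.step()
    return model

def recon_error(model, x):
    # Per-example mean squared reconstruction error = anomaly score.
    with torch.no_grad():
        t = torch.from_numpy(x)
        return ((model(t) - t) ** 2).mean(dim=1).numpy()

# Poison the training baseline with varying proportions of malicious activity.
for poison_frac in (0.0, 0.01, 0.05, 0.10):
    n_train = 2000
    n_bad = int(poison_frac * n_train)
    x_train = np.vstack([make_data(n_train - n_bad),
                         make_data(n_bad, malicious=True)])
    model = train_autoencoder(x_train)

    # Alert threshold: 99th percentile of training reconstruction error
    # (one common convention; the paper may choose differently).
    thresh = np.quantile(recon_error(model, x_train), 0.99)

    x_benign, x_mal = make_data(500), make_data(500, malicious=True)
    fpr = (recon_error(model, x_benign) > thresh).mean()
    tpr = (recon_error(model, x_mal) > thresh).mean()
    print(f"poison={poison_frac:.0%}  detection rate={tpr:.2f}  false alarms={fpr:.2f}")
```

Robustness in this setting means the detection rate on held-out malicious traffic degrades only slightly as poison_frac grows: with a narrow bottleneck, a small poisoned minority contributes little to the reconstruction objective, so malicious points still reconstruct poorly at test time.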