Kelly M. Testa

Automatic Labeling for Entity Extraction in Cyber Security

Jun 09, 2014

Robert A. Bridges, Corinne L. Jones, Michael D. Iannacone, Kelly M. Testa, John R. Goodall

Abstract: Timely analysis of cyber-security information necessitates automated information extraction from unstructured text. While state-of-the-art extraction methods produce extremely accurate results, they require ample training data, which is generally unavailable for specialized applications, such as detecting security-related entities; moreover, manual annotation of corpora is very costly and often not a viable solution. In response, we develop a very precise method to automatically label text from several data sources by leveraging related, domain-specific, structured data and provide public access to a corpus annotated with cyber-security entities. Next, we implement a Maximum Entropy Model trained with the average perceptron on a portion of our corpus ($\sim$750,000 words) and achieve near-perfect precision, recall, and accuracy, with training times under 17 seconds.
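The abstract does not spell out the labeling procedure, so the following is only a minimal sketch of one way structured data could drive automatic labeling: a longest-match lookup of token spans against a gazetteer, emitting BIO tags. The function name `auto_label`, the `GAZETTEER` entries, and the label names (`vendor`, `application`, `os`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple


def auto_label(tokens: List[str], gazetteer: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO labels by longest-match lookup of token spans in a gazetteer.

    This is an illustrative assumption, not the authors' published method.
    """
    labels = ["O"] * len(tokens)
    # Longest phrase in the gazetteer bounds how far ahead we need to look.
    max_len = max((len(key) for key in gazetteer), default=0)
    lowered = [t.lower() for t in tokens]  # case-insensitive matching

    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            if span in gazetteer:
                entity = gazetteer[span]
                labels[i] = "B-" + entity
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + entity
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical gazetteer, standing in for fields of a structured vulnerability record.
GAZETTEER = {
    ("adobe",): "vendor",
    ("flash", "player"): "application",
    ("windows",): "os",
}

tokens = "Adobe Flash Player on Windows allows remote attackers".split()
labels = auto_label(tokens, GAZETTEER)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Labels produced this way could serve as training data for a sequence tagger such as the averaged-perceptron model the abstract mentions. Longest-match-first is a design choice here so that multi-word names are not split into shorter matches.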

* 10 pages
