Abstract:In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores, and Title-driven approaches for summarization. We describe a new framework, NewsSumm, that includes many existing and new approaches for summarization including ILP and title-driven approaches. NewsSumm's flexibility allows to combine different algorithms and sentence scoring schemes seamlessly. Our results combining sentence scoring with ILP and normalization are in contrast to previous work on this topic, showing the importance of a broader search for optimal parameters. We also show that the new title-driven reduction idea leads to improvement in performance for both unsupervised and supervised approaches considered.
Abstract:Advanced machine learning and natural language techniques enable attackers to launch sophisticated and targeted social engineering-based attacks. To counter the active attacker issue, researchers have since resorted to proactive methods of detection. Email masquerading using targeted emails to fool the victim is an advanced attack method. However automatic text generation requires controlling the context and coherency of the generated content, which has been identified as an increasingly difficult problem. The method used leverages a hierarchical deep neural model which uses a learned representation of the sentences in the input document to generate structured written emails. We demonstrate the generation of short and targeted text messages using the deep model. The global coherency of the synthesized text is evaluated using a qualitative study as well as multiple quantitative measures.
Abstract:With an increasing number of malicious attacks, the number of people and organizations falling prey to social engineering attacks is proliferating. Despite considerable research in mitigation systems, attackers continually improve their modus operandi by using sophisticated machine learning, natural language processing techniques with an intent to launch successful targeted attacks aimed at deceiving detection mechanisms as well as the victims. We propose a system for advanced email masquerading attacks using Natural Language Generation (NLG) techniques. Using legitimate as well as an influx of varying malicious content, the proposed deep learning system generates \textit{fake} emails with malicious content, customized depending on the attacker's intent. The system leverages Recurrent Neural Networks (RNNs) for automated text generation. We also focus on the performance of the generated emails in defeating statistical detectors, and compare and analyze the emails using a proposed baseline.