Hongsong Zhu

When LLMs Meet Cybersecurity: A Systematic Literature Review

May 06, 2024

Jie Zhang, Haoyu Bu, Hui Wen, Yu Chen, Lun Li, Hongsong Zhu

Abstract: The rapid advancements in large language models (LLMs) have opened new avenues across various fields, including cybersecurity, which faces an ever-evolving threat landscape and a need for innovative technologies. Despite initial explorations into the application of LLMs in cybersecurity, there is a lack of a comprehensive overview of this research area. This paper bridges this gap by providing a systematic literature review, encompassing an analysis of over 180 works, spanning 25 LLMs and more than 10 downstream scenarios. Our comprehensive overview addresses three critical research questions: the construction of cybersecurity-oriented LLMs, LLMs' applications in various cybersecurity tasks, and the existing challenges and further research in this area. This study aims to shed light on the extensive potential of LLMs in enhancing cybersecurity practices and to serve as a valuable resource for applying LLMs in this domain. We also maintain a regularly updated list of practical guides on LLMs for cybersecurity at https://github.com/tmylla/Awesome-LLM4Cybersecurity.

* 36 pages, 7 figures

Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning

Apr 10, 2024

Peipei Liu, Gaosheng Wang, Ying Tong, Jian Liang, Zhenquan Ding, Hongsong Zhu

Abstract: Few-shot named entity recognition can identify new types of named entities based on a few labeled examples. Previous methods employing token-level or span-level metric learning suffer from a heavy computational burden and a large number of negative sample spans. In this paper, we propose Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER), which splits general NER into two stages: entity-span detection and entity classification. MsFNER involves three processes: training, finetuning, and inference. In the training process, we separately train the best entity-span detection model and the best entity classification model on the source domain using meta-learning, and we create a contrastive learning module to enhance entity representations for entity classification. During finetuning, we finetune both models on the support dataset of the target domain. In the inference process, for the unlabeled data, we first detect the entity-spans, and the detected entity-spans are then jointly classified by the entity classification model and KNN. We conduct experiments on the open FewNERD dataset, and the results demonstrate the advantage of MsFNER.
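
The inference step, in which a detected span is typed jointly by the classification model and KNN, could look roughly like the sketch below. The abstract does not say how the two signals are combined, so the linear interpolation with weight `lam`, the Euclidean distance, and the names `knn_distribution` and `classify_span` are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def knn_distribution(query: Sequence[float], support: List[Sequence[float]],
                     labels: List[str], k: int = 3) -> Dict[str, float]:
    """Label distribution among the k support spans nearest to the query."""
    order = sorted(range(len(support)), key=lambda i: math.dist(query, support[i]))
    nearest = order[:k]
    votes = Counter(labels[i] for i in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}


def classify_span(query: Sequence[float], model_probs: Dict[str, float],
                  support: List[Sequence[float]], labels: List[str],
                  k: int = 3, lam: float = 0.5) -> str:
    """Pick the type with the highest interpolated score (assumed combination rule)."""
    knn_probs = knn_distribution(query, support, labels, k)
    types = sorted(set(model_probs) | set(knn_probs))
    scores = {t: lam * model_probs.get(t, 0.0) + (1 - lam) * knn_probs.get(t, 0.0)
              for t in types}
    return max(types, key=lambda t: scores[t])


# Toy support set: span embeddings with their gold types (made-up values).
support = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
labels = ["person", "person", "location", "location"]

# The classifier alone slightly prefers "location"; the neighbours disagree.
predicted = classify_span([0.2, 0.1], {"person": 0.45, "location": 0.55}, support, labels)
```

In this toy case the two nearest support spans are both `person`, which outweighs the classifier's narrow preference for `location`.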

A Novel Framework for Multimodal Named Entity Recognition with Multi-level Alignments

May 15, 2023

Peipei Liu, Hong Li, Yimo Ren, Jie Liu, Shuaizong Si, Hongsong Zhu, Limin Sun

Abstract: Mining structured knowledge from tweets using named entity recognition (NER) can be beneficial for many downstream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named entity recognition (MNER) has attracted more attention. In this paper, we propose a novel approach that dynamically aligns the image and text sequence and performs multi-level cross-modal learning to augment textual word representations for MNER. Specifically, our framework can be split into three main stages: the first stage focuses on intra-modality representation learning to derive the implicit global and local knowledge of each modality; the second evaluates the relevance between the text and its accompanying image and integrates visual information of different granularities based on that relevance; the third enforces semantic refinement via iterative cross-modal interactions and co-attention. We conduct experiments on two open datasets, and the results and detailed analysis demonstrate the advantage of our model.

Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis

Oct 28, 2022

Peipei Liu, Xin Zheng, Hong Li, Jie Liu, Yimo Ren, Hongsong Zhu, Limin Sun

Abstract: Modality representation learning is an important problem for multimodal sentiment analysis (MSA), since highly distinguishable representations can help improve the analysis. Previous works on MSA have usually focused on multimodal fusion strategies, and modal representation learning has received less attention. Recently, contrastive learning has been shown to be effective at endowing learned representations with stronger discriminative ability. Inspired by this, we explore approaches for improving modality representations with contrastive learning. To this end, we devise a three-stage framework with multi-view contrastive learning to refine representations for specific objectives. In the first stage, to improve unimodal representations, we employ supervised contrastive learning to pull samples within the same class together while pushing the other samples apart. In the second stage, self-supervised contrastive learning is designed to improve the distilled unimodal representations after cross-modal interaction. Finally, we again leverage supervised contrastive learning to enhance the fused multimodal representation. After all the contrastive training, we perform the classification task based on the frozen representations. We conduct experiments on three open datasets, and the results show the advantage of our model.
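
The first-stage objective, pulling same-class samples together and pushing others apart, can be illustrated with a small supervised contrastive loss. The abstract does not give the exact formula, so this sketch assumes the widely used SupCon form with cosine similarity and a temperature; the function name `supcon_loss` and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def supcon_loss(embeddings: List[Sequence[float]], labels: List[int],
                temperature: float = 0.1) -> float:
    """Mean supervised contrastive loss over anchors that have a positive."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without same-class partners contributes nothing
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Two tight clusters; labels that match the clusters should give a low loss.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(vectors, [0, 0, 1, 1])
loss_mismatched = supcon_loss(vectors, [0, 1, 0, 1])
```

When the labels agree with the geometric clusters the loss is close to zero, and it grows when same-class samples sit far apart, which is the pressure the abstract describes.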

CEntRE: A paragraph-level Chinese dataset for Relation Extraction among Enterprises

Oct 19, 2022

Peipei Liu, Hong Li, Zhiyu Wang, Yimo Ren, Jie Liu, Fei Lyu, Hongsong Zhu, Limin Sun

Abstract: Enterprise relation extraction aims to detect pairs of enterprise entities and identify the business relations between them from unstructured or semi-structured text data, and it is crucial for several real-world applications such as risk analysis, rating research and supply chain security. However, previous work mainly focuses on obtaining attribute information about enterprises, such as personnel and corporate business, and pays little attention to enterprise relation extraction. To encourage further progress in this research, we introduce CEntRE, a new dataset constructed from publicly available business news data with careful human annotation and intelligent data processing. Extensive experiments on CEntRE with six strong models demonstrate the challenges posed by our proposed dataset.

Multi-features based Semantic Augmentation Networks for Named Entity Recognition in Threat Intelligence

Jul 01, 2022

Peipei Liu, Hong Li, Zuoguang Wang, Jie Liu, Yimo Ren, Hongsong Zhu

Abstract: Extracting cybersecurity entities such as attackers and vulnerabilities from unstructured network texts is an important part of security analysis. However, the sparsity of intelligence data, resulting from the high-frequency variations and the randomness of cybersecurity entity names, makes it difficult for current methods to perform well in extracting security-related concepts and entities. To this end, we propose a semantic augmentation method that incorporates different linguistic features to enrich the representation of input tokens in order to detect and classify cybersecurity names in unstructured text. In particular, we encode and aggregate the constituent feature, morphological feature and part-of-speech feature for each input token to improve the robustness of the method. Moreover, a token receives augmented semantic information from its K most similar words in a cybersecurity domain corpus, where an attentive module is leveraged to weigh the differences between the words, and from contextual clues based on a large-scale general-domain corpus. We have conducted experiments on the cybersecurity datasets DNRTI and MalwareTextDB, and the results demonstrate the effectiveness of the proposed method.
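
The idea of augmenting a token with its K most similar domain words, weighted by an attentive module, can be sketched as follows. This is a parameter-free stand-in: it uses cosine similarity to pick neighbours and a softmax over those similarities as the attention weights, whereas the paper's module is presumably learned. The function `augment`, the toy vocabulary, and all vector values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def augment(token_vec: Sequence[float], domain_vectors: Dict[str, Sequence[float]],
            k: int = 2) -> Tuple[List[str], List[float], List[float]]:
    """Return the top-k neighbours, their weights, and the augmented vector."""
    ranked = sorted(domain_vectors,
                    key=lambda w: cosine(token_vec, domain_vectors[w]),
                    reverse=True)[:k]
    weights = softmax([cosine(token_vec, domain_vectors[w]) for w in ranked])
    # Attention-weighted sum of the neighbour vectors.
    summary = [sum(wt * domain_vectors[w][d] for wt, w in zip(weights, ranked))
               for d in range(len(token_vec))]
    # Concatenate the original token vector with the neighbour summary.
    return ranked, weights, list(token_vec) + summary


# Made-up 3-dimensional embeddings for a tiny "domain corpus".
domain_vectors = {
    "trojan": [0.9, 0.1, 0.0],
    "ransomware": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
neighbours, weights, augmented = augment([1.0, 0.1, 0.0], domain_vectors, k=2)
```

Concatenation is only one way to merge the summary into the token representation; addition or gating would fit the abstract's description equally well.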

Threat Detection for General Social Engineering Attack Using Machine Learning Techniques

Mar 17, 2022

Zuoguang Wang, Yimo Ren, Hongsong Zhu, Limin Sun

Abstract: This paper explores threat detection for general Social Engineering (SE) attacks using Machine Learning (ML) techniques, rather than focusing on or being limited to a specific SE attack type, e.g. email phishing. First, this paper processes and obtains more SE threat data from the previous Knowledge Graph (KG), and then extracts different threat features and generates new datasets corresponding to three different feature combinations. Finally, 9 types of ML models are created and trained using the three datasets, respectively, and their performance is compared and analyzed across 27 threat detectors and 270 experiments. The experimental results and analyses show that: 1) ML techniques are feasible for detecting general SE attacks and some ML models are quite effective; ML-based SE threat detection is complementary to KG-based approaches; 2) the generated datasets are usable, and the SE domain ontology proposed in previous work can dissect SE attacks and deliver the SE threat features, allowing it to be used as a data model for future research. In addition, further conclusions and analyses about the characteristics of different ML detectors and the datasets are discussed.
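
The counts in the abstract fit together as a simple grid: 9 model types trained on 3 feature-combination datasets give 27 detectors. The short sketch below enumerates that grid. The model and dataset names are placeholders, since the abstract does not list them, and reading the 270 experiments as 10 runs per detector is an inference from the numbers rather than something the abstract states.

```python
from itertools import product

# Placeholder names: the abstract does not say which models or features are used.
MODEL_TYPES = [f"model_type_{i}" for i in range(1, 10)]  # 9 ML model types
FEATURE_SETS = ["feature_combo_A", "feature_combo_B", "feature_combo_C"]

# One detector per (model type, dataset) pair.
detectors = list(product(MODEL_TYPES, FEATURE_SETS))

# If the 270 experiments are spread evenly, this is the number of runs each
# detector gets (an assumption, not a figure given in the abstract).
runs_per_detector = 270 // len(detectors)
```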

Discontinuous Named Entity Recognition as Maximal Clique Discovery

Jun 01, 2021

Yucheng Wang, Bowen Yu, Hongsong Zhu, Tingwen Liu, Nan Yu, Limin Sun

Abstract: Named entity recognition (NER) remains challenging when entity mentions can be discontinuous. Existing methods break the recognition process into several sequential steps. In training, they predict conditioned on the golden intermediate results, while at inference they rely on the model output of the previous steps, which introduces exposure bias. To solve this problem, we first construct a segment graph for each sentence, in which each node denotes a segment (a continuous entity on its own, or a part of discontinuous entities), and an edge links two nodes that belong to the same entity. The nodes and edges can each be generated in one stage with a grid tagging scheme and learned jointly using a novel architecture named Mac. Discontinuous NER can then be reformulated as a non-parametric process of discovering maximal cliques in the graph and concatenating the spans in each clique. Experiments on three benchmarks show that our method outperforms the state-of-the-art (SOTA) results, with up to 3.5 percentage points improvement on F1, and achieves a 5x speedup over the SOTA model.

* ACL 2021
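
The decoding step described in the abstract above, finding maximal cliques in the segment graph and concatenating the spans in each clique, can be sketched as below. This is a minimal illustration that assumes the segments and edges have already been predicted; it uses the basic Bron-Kerbosch recursion, which the abstract does not name, and the function names and example sentence are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices of one segment


def maximal_cliques(adj: Dict[Span, Set[Span]]) -> List[Set[Span]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[Span]] = []

    def expand(r: Set[Span], p: Set[Span], x: Set[Span]) -> None:
        if not p and not x:
            if r:
                cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return cliques


def decode_entities(tokens: Sequence[str], segments: List[Span],
                    edges: List[Tuple[Span, Span]]) -> List[str]:
    """Turn each maximal clique into one entity by joining its spans in order."""
    adj: Dict[Span, Set[Span]] = {s: set() for s in segments}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    entities = []
    for clique in maximal_cliques(adj):
        words: List[str] = []
        for start, end in sorted(clique):
            words.extend(tokens[start:end + 1])
        entities.append(" ".join(words))
    return sorted(entities)


tokens = ["muscle", "pain", "and", "fatigue"]
segments = [(0, 0), (1, 1), (3, 3)]  # "muscle", "pain", "fatigue"
edges = [((0, 0), (1, 1)), ((0, 0), (3, 3))]  # segments that share an entity
entities = decode_entities(tokens, segments, edges)
```

Here the segment "muscle" belongs to two cliques, so the sketch recovers both "muscle pain" and the discontinuous "muscle fatigue". An isolated segment forms a clique of size one and is returned as a continuous entity on its own.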

TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking

Oct 26, 2020

Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu, Limin Sun

Abstract: Extracting entities and relations from unstructured text has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in identifying overlapping relations with shared entities. Prior works show that joint learning can result in a noticeable performance gain. However, they usually involve sequential interrelated steps and suffer from the problem of exposure bias. At training time, they predict with the ground truth conditions, while at inference they have to perform extraction from scratch. This discrepancy leads to error accumulation. To mitigate the issue, we propose in this paper a one-stage joint extraction model, namely TPLinker, which is capable of discovering overlapping relations sharing one or both entities while being immune to exposure bias. TPLinker formulates joint extraction as a token pair linking problem and introduces a novel handshaking tagging scheme that aligns the boundary tokens of entity pairs under each relation type. Experimental results show that TPLinker performs significantly better on overlapping and multiple relation extraction, and achieves state-of-the-art performance on two public datasets.

* COLING 2020
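
To make the token pair linking idea in the abstract above concrete, the sketch below decodes triples from three kinds of links: entity head to entity tail, subject head to object head, and subject tail to object tail (the latter two kept per relation type). It is a simplified illustration that takes the links as given and ignores how the model scores and lays out the tags; the function name `decode_triples`, the data layout, and the example are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[int, int]  # a link between two token indices


def decode_triples(tokens: Sequence[str],
                   entity_links: Set[Pair],
                   head_links: Dict[str, Set[Pair]],
                   tail_links: Dict[str, Set[Pair]]) -> List[Tuple[str, str, str]]:
    """Combine boundary-token links into (subject, relation, object) triples."""
    # Index entity spans by their head token so head links can look them up.
    spans_by_head: Dict[int, List[Pair]] = {}
    for head, tail in entity_links:
        spans_by_head.setdefault(head, []).append((head, tail))

    def text(span: Pair) -> str:
        return " ".join(tokens[span[0]:span[1] + 1])

    triples = []
    for relation, pairs in head_links.items():
        for subj_head, obj_head in pairs:
            for subj in spans_by_head.get(subj_head, []):
                for obj in spans_by_head.get(obj_head, []):
                    # Keep the pair only if the tail tokens are linked as well.
                    if (subj[1], obj[1]) in tail_links.get(relation, set()):
                        triples.append((text(subj), relation, text(obj)))
    return sorted(triples)


tokens = ["New", "York", "City", "is", "in", "New", "York", "State"]
entity_links = {(0, 2), (5, 7)}  # "New York City", "New York State"
head_links = {"located_in": {(0, 5)}}  # subject head -> object head
tail_links = {"located_in": {(2, 7)}}  # subject tail -> object tail
triples = decode_triples(tokens, entity_links, head_links, tail_links)
```

Because every relation type keeps its own head and tail links, the same entity pair can appear under several relations, which is how overlapping relations are represented in this scheme.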
