Prabhupad Pradhan

Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness

Jun 06, 2024

Lars Hillebrand, Prabhupad Pradhan, Christian Bauckhage, Rafet Sifa

Abstract: We introduce "pointer-guided segment ordering" (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.

* 17 pages, 3 figures, 5 tables, accepted at ECML-PKDD 2024
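
The segment-ordering objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the pointer head, so the sketch assumes that each shuffled segment has a query vector, each candidate original position has a key vector, and a scaled dot product followed by a row-wise softmax gives a distribution over original positions, trained with cross-entropy. All function names (`shuffle_segments`, `pointer_log_probs`, `segment_ordering_loss`, `restore_order`) are hypothetical, and random vectors stand in for encoder outputs.

```python
import math
import random


def shuffle_segments(segments, rng):
    """Shuffle segments; perm[j] is the original index of shuffled[j]."""
    perm = list(range(len(segments)))
    rng.shuffle(perm)
    return [segments[i] for i in perm], perm


def pointer_log_probs(queries, keys):
    """Scaled dot-product scores followed by a row-wise log-softmax.

    Assumption: row j is a log-distribution over candidate original
    positions for shuffled segment j.
    """
    scale = math.sqrt(len(queries[0]))
    rows = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        rows.append([s - log_z for s in scores])
    return rows


def segment_ordering_loss(log_probs, targets):
    """Mean cross-entropy between pointer distributions and true positions."""
    return -sum(row[t] for row, t in zip(log_probs, targets)) / len(targets)


def restore_order(shuffled, positions):
    """Place each shuffled segment at its (predicted or true) position."""
    restored = [None] * len(shuffled)
    for segment, pos in zip(shuffled, positions):
        restored[pos] = segment
    return restored


rng = random.Random(0)
segments = ["Introduction.", "Method.", "Results.", "Conclusion."]
shuffled, perm = shuffle_segments(segments, rng)

# Random vectors stand in for the encoder's segment representations.
dim = 8
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in shuffled]
keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in shuffled]

log_probs = pointer_log_probs(queries, keys)
print("shuffled input:", shuffled)
print("untrained loss:", segment_ordering_loss(log_probs, perm))
print("restored with true positions:", restore_order(shuffled, perm))
```

With untrained random vectors the loss is uninformative; in an actual pre-training setup the vectors would come from the language model and the loss would be minimized so that the pointer distribution concentrates on each segment's true position.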
