Title: Adapting Large Language Models via Reading Comprehension

Sep 18, 2023

Daixuan Cheng, Shaohan Huang, Furu Wei

Abstract: We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows the model with domain knowledge, but drastically hurts its prompting ability for question answering. Taking inspiration from human learning via reading comprehension, where practice after reading improves the ability to answer questions based on the learned knowledge, we propose a simple method for transforming raw corpora into reading comprehension texts. Each raw text is enriched with a series of tasks related to its content. Our method, highly scalable and applicable to any pre-training corpus, consistently enhances performance across various tasks in three different domains: biomedicine, finance, and law. Notably, our 7B language model performs competitively with domain-specific models of much larger scale, such as BloombergGPT-50B. Furthermore, we demonstrate that domain-specific reading comprehension texts can improve the model's performance even on general benchmarks, showing the potential to develop a general model across even more domains. Our model, code, and data will be available at https://github.com/microsoft/LMOps.
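
The idea of enriching each raw text with tasks related to its content can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two task types (title prediction and completion of the final sentence), and the prompt wording are all hypothetical choices made only to show the shape of the transformation.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def to_reading_comprehension(title: str, raw_text: str) -> str:
    """Append simple comprehension tasks to a raw text (illustrative only)."""
    sentences = split_sentences(raw_text)
    tasks = []

    # Hypothetical task 1: summarization, with the title as the reference answer.
    tasks.append(
        "Question: What is a suitable title for the text above?\n"
        f"Answer: {title}"
    )

    # Hypothetical task 2: text completion, holding out the final sentence.
    if len(sentences) >= 2:
        context = " ".join(sentences[:-1])
        tasks.append(
            "Question: How would you complete the following passage?\n"
            f"{context}\n"
            f"Answer: {sentences[-1]}"
        )

    # The raw text comes first, followed by the tasks derived from it.
    return raw_text.strip() + "\n\n" + "\n\n".join(tasks)


if __name__ == "__main__":
    doc = "Aspirin reduces fever. It also thins the blood."
    print(to_reading_comprehension("Aspirin", doc))
```

Each output document keeps the original text intact and follows it with question-answer pairs, so a model trained on it sees both the domain content and practice at answering prompts about that content.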
