Title: Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models

May 16, 2023

Boxi Cao, Qiaoyu Tang, Hongyu Lin, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun

Abstract: Memory is one of the most essential cognitive functions, serving as a repository of world knowledge and episodes of activities. In recent years, large-scale pre-trained language models have shown remarkable memorizing ability. In contrast, vanilla neural networks without pre-training have long been observed to suffer from catastrophic forgetting. To investigate this retentive-forgetful contradiction and understand the memory mechanism of language models, we conduct thorough experiments by controlling the target knowledge types, the learning strategies, and the learning schedules. We find that: 1) vanilla language models are forgetful; 2) pre-training leads to retentive language models; 3) knowledge relevance and diversification significantly influence memory formation. These conclusions are useful for understanding the abilities of pre-trained language models and shed light on designing and evaluating new learning and inference algorithms for language models.
