Title: How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study

Mar 04, 2024

Tianjie Ju, Weiwei Sun, Wei Du, Xinwei Yuan, Zhaochun Ren, Gongshen Liu

Abstract: Previous work has showcased the intriguing capability of large language models (LLMs) in retrieving facts and processing context knowledge. However, only limited research exists on the layer-wise capability of LLMs to encode knowledge, which challenges our understanding of their internal mechanisms. In this paper, we make the first attempt to investigate the layer-wise capability of LLMs through probing tasks. We leverage the powerful generative capability of ChatGPT to construct probing datasets, providing diverse and coherent evidence corresponding to various facts. We employ $\mathcal{V}$-usable information as the validation metric to better reflect the capability to encode context knowledge across different layers. Our experiments on conflicting and newly acquired knowledge show that LLMs: (1) prefer to encode more context knowledge in the upper layers; (2) primarily encode context knowledge within knowledge-related entity tokens at lower layers, while progressively spreading more knowledge to other tokens at upper layers; and (3) gradually forget the earlier context knowledge retained within the intermediate layers when provided with irrelevant evidence. Code is publicly available at https://github.com/Jometeorie/probing_llama.

* Accepted at LREC-COLING 2024 (Long Paper)
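
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how $\mathcal{V}$-usable information is commonly estimated for a probe: the entropy of the labels under a probe trained on an empty (null) input, minus the entropy under a probe trained on the layer's representations, both measured as the average negative log-probability of the gold label on held-out data. This is not taken from the authors' repository; the function names, layer names, and probability values below are hypothetical and chosen only for illustration.

```python
import math

def v_entropy(gold_probs):
    """Average negative log2-probability a probe assigns to the gold labels (in bits)."""
    return -sum(math.log2(p) for p in gold_probs) / len(gold_probs)

def v_usable_information(null_probs, cond_probs):
    """Estimate I_V(X -> Y) = H_V(Y) - H_V(Y | X) in bits.

    null_probs: gold-label probabilities from a probe trained on a null input.
    cond_probs: gold-label probabilities from a probe trained on layer representations.
    """
    return v_entropy(null_probs) - v_entropy(cond_probs)

# Hypothetical held-out probabilities for a binary probing task (assumed values).
null_probs = [0.5, 0.5, 0.5, 0.5]
layer_probs = {
    "lower_layer": [0.5, 0.5, 0.5, 0.5],   # probe does no better than chance
    "upper_layer": [1.0, 1.0, 0.5, 0.5],   # probe is certain on half the examples
}

for name, probs in layer_probs.items():
    print(name, v_usable_information(null_probs, probs))
```

Under these assumed numbers, the lower layer yields 0 bits and the upper layer 0.5 bits; a higher value means a probe from the chosen model family can extract more label-relevant information from that layer's representations.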
