Tatsuro Inaba

How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders

Mar 09, 2025

Tatsuro Inaba, Kentaro Inui, Yusuke Miyao, Yohei Oseki, Benjamin Heinzerling, Yu Takagi

Abstract: Large Language Models (LLMs) demonstrate remarkable multilingual capabilities and broad knowledge. However, the internal mechanisms underlying the development of these capabilities remain poorly understood. To investigate this, we analyze how the information encoded in LLMs' internal representations evolves during the training process. Specifically, we train sparse autoencoders at multiple checkpoints of the model and systematically compare the interpretation results across these stages. Our findings suggest that LLMs initially acquire language-specific knowledge independently, followed by cross-linguistic correspondences. Moreover, we observe that after mastering token-level knowledge, the model transitions to learning higher-level, abstract concepts, indicating the development of more conceptual understanding.

* Our code, demo, SAE weights are available at: https://github.com/llm-jp/llm-jp-sae
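
For readers unfamiliar with the term, the sketch below shows what a sparse autoencoder computes on a single activation vector: an encoder with a ReLU produces non-negative features, a decoder reconstructs the input, and the loss combines reconstruction error with a sparsity penalty. This is a minimal sketch of one common formulation (L1 penalty), not the authors' implementation; the function names, weight layout, and `l1_coeff` value are assumptions made for illustration, and the linked repository may use a different variant.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode one activation vector into sparse features and reconstruct it.

    Assumed shapes (hypothetical layout): w_enc is n_features x d_model,
    w_dec is d_model x n_features.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_act = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    features = [max(0.0, p) for p in pre_act]  # ReLU keeps features non-negative
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Mean squared reconstruction error plus an L1 sparsity penalty."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [0.0, 0.0]
    feats, recon = sae_forward([1.0, -2.0], identity, zeros, identity, zeros)
    print(feats, recon, sae_loss([1.0, -2.0], feats, recon, l1_coeff=0.5))
```

Training one such autoencoder per checkpoint, as the abstract describes, would amount to minimizing this loss over activations collected from each checkpoint separately and then comparing the learned features.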

Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference

Jan 27, 2025

Go Kamoda, Benjamin Heinzerling, Tatsuro Inaba, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui

Abstract: According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that form the model's "inner vocabulary". Prior analysis of this detokenization stage has predominantly relied on probing and interventions such as path patching, which involve selecting particular inputs, choosing a subset of components that will be patched, and then observing changes in model behavior. Here, we show that several important aspects of the detokenization stage can be understood purely by analyzing model weights, without performing any model inference steps. Specifically, we introduce an analytical decomposition of first-layer attention in GPT-2. Our decomposition yields interpretable terms that quantify the relative contributions of position-related, token-related, and mixed effects. By focusing on terms in this decomposition, we discover weight-based explanations of attention bias toward close tokens and attention for detokenization.

* 22 pages, 14 figures, to appear in NAACL Findings 2025
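
The idea of splitting a first-layer attention score into token, position, and mixed terms can be illustrated with a simplified sketch. It assumes the first-layer input is the sum of a token embedding and a position embedding and that the pre-softmax score is a bilinear form in a combined query-key matrix `a`; layer normalization, biases, and scaling are deliberately ignored, so this is not the paper's exact decomposition. All names below are hypothetical.

```python
def bilinear(u, a, v):
    """Compute u^T a v for vectors u, v and matrix a (list of rows)."""
    return sum(u[r] * a[r][c] * v[c] for r in range(len(u)) for c in range(len(v)))


def decompose_score(e_q, p_q, e_k, p_k, a):
    """Split (e_q + p_q)^T a (e_k + p_k) into its four bilinear terms."""
    return {
        "token_token": bilinear(e_q, a, e_k),
        "token_position": bilinear(e_q, a, p_k),      # mixed term
        "position_token": bilinear(p_q, a, e_k),      # mixed term
        "position_position": bilinear(p_q, a, p_k),
    }


def full_score(e_q, p_q, e_k, p_k, a):
    """Score computed directly from the summed embeddings."""
    x_q = [e + p for e, p in zip(e_q, p_q)]
    x_k = [e + p for e, p in zip(e_k, p_k)]
    return bilinear(x_q, a, x_k)


if __name__ == "__main__":
    a = [[1, 0], [2, 1]]  # toy stand-in for a combined query-key matrix
    terms = decompose_score([1, 2], [0, 1], [3, -1], [1, 1], a)
    print(terms, sum(terms.values()), full_score([1, 2], [0, 1], [3, -1], [1, 1], a))
```

Because every term depends only on embedding vectors and weight matrices, each can be evaluated for any token or position pair without running the model, which matches the weight-only spirit of the analysis described in the abstract.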

MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

May 26, 2023

Tatsuro Inaba, Hirokazu Kiyomaru, Fei Cheng, Sadao Kurohashi

Abstract: Large language models (LLMs) have achieved impressive performance on various reasoning tasks. To further improve the performance, we propose MultiTool-CoT, a novel framework that leverages chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, during the reasoning process. We apply MultiTool-CoT to the Task 2 dataset of NumGLUE, which requires both numerical reasoning and domain-specific knowledge. The experiments show that our method significantly outperforms strong baselines and achieves state-of-the-art performance.

* ACL2023. Our code is available at https://github.com/InabaTatsuro/MultiTool-CoT
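
The general pattern of interleaving chain-of-thought generation with tool calls can be sketched as a loop: the model generates until it emits a tool trigger, the tool runs, and its output is appended before generation resumes. The sketch below is an illustration under stated assumptions, not the authors' code: the `<<ToolName>>` trigger syntax, the `run_with_tools` and `calculator` helpers, and the `fake_model` stub standing in for GPT-3 are all hypothetical.

```python
import ast
import operator
import re

# Hypothetical trigger format: "<<ToolName>> argument" at the end of a chunk.
TOOL_CALL = re.compile(r"<<(\w+)>>\s*(.*)$")

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression):
    """Evaluate +, -, *, / arithmetic without calling eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expression, mode="eval").body))


def run_with_tools(generate, prompt, tools, max_steps=5):
    """Alternate between model generation and tool execution."""
    text = prompt
    for _ in range(max_steps):
        chunk = generate(text)
        text += chunk
        match = TOOL_CALL.search(chunk)
        if match is None:  # no tool requested: reasoning is finished
            return text
        name, arg = match.group(1), match.group(2).strip()
        text += " => " + tools[name](arg) + "\n"
    return text


def fake_model(text):
    """Stub standing in for an LLM; it requests one calculation, then answers."""
    if "=>" not in text:
        return "Two groups of 12 plus 6 more. <<Calculator>> 2 * 12 + 6"
    return "So the answer is 30."


if __name__ == "__main__":
    print(run_with_tools(fake_model, "Q: How many items?\nA: ", {"Calculator": calculator}))
```

Supporting several tools, such as the knowledge retriever mentioned in the abstract, would only require adding more entries to the `tools` dictionary.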
