Low-rank optimization has emerged as a promising approach to enabling memory-efficient training of large language models (LLMs). Existing low-rank optimization methods typically project gradients onto a low-rank subspace, reducing the memory cost of storing optimizer states. A key challenge in these methods is identifying suitable subspaces so that the optimization trajectory remains effective. Most existing approaches select the dominant subspace to preserve gradient information, as this intuitively yields the best low-rank approximation of the gradient. However, we find that in practice the dominant subspace stops changing during pretraining, thereby constraining weight updates to similar subspaces. In this paper, we propose importance sampling subspace selection (I3S) for low-rank optimization, which offers a convergence rate theoretically comparable to that of the dominant-subspace approach. Empirically, we demonstrate that I3S significantly outperforms previous methods on LLM pretraining tasks.
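To make the contrast concrete, the sketch below compares dominant-subspace selection (keep the top-`rank` singular directions of the gradient) with an importance-sampling alternative that draws directions with probability proportional to their squared singular values. This is a minimal illustrative sketch, not the paper's exact algorithm: the sampling distribution, the use of squared singular values, and all function names here are assumptions.

```python
import numpy as np

def dominant_subspace(grad, rank):
    """Baseline: keep the top-`rank` left singular vectors of the gradient."""
    U, s, _ = np.linalg.svd(grad, full_matrices=False)
    return U[:, :rank]

def importance_sampled_subspace(grad, rank, rng=None):
    """Illustrative I3S-style selection (assumed distribution): sample `rank`
    singular directions with probability proportional to squared singular
    values, rather than deterministically keeping the dominant ones."""
    rng = np.random.default_rng() if rng is None else rng
    U, s, _ = np.linalg.svd(grad, full_matrices=False)
    probs = s**2 / np.sum(s**2)          # importance weights (assumption)
    idx = rng.choice(len(s), size=rank, replace=False, p=probs)
    return U[:, idx]

# Usage: project the gradient into the selected subspace so optimizer states
# (e.g., Adam moments) are stored at the reduced rank, then map the update back.
rng = np.random.default_rng(0)
grad = rng.standard_normal((1024, 4096))   # toy weight-gradient matrix
P = importance_sampled_subspace(grad, rank=64, rng=rng)
low_rank_grad = P.T @ grad                 # shape (64, 4096): what the optimizer sees
full_size_update = P @ low_rank_grad       # projected back to the weight shape
```

Because directions outside the top-`rank` set have nonzero probability of being chosen, resampling periodically lets the update subspace keep changing over training, which is the behavior the dominant-subspace choice lacks.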