Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Error Feedback Can Accurately Compress Preconditioners

Jun 16, 2023

Ionut-Vlad Modoranu, Aleksei Kalinov, Eldar Kurtic, Dan Alistarh

Figure 1 for Error Feedback Can Accurately Compress Preconditioners

Figure 2 for Error Feedback Can Accurately Compress Preconditioners

Figure 3 for Error Feedback Can Accurately Compress Preconditioners

Figure 4 for Error Feedback Can Accurately Compress Preconditioners

Share this with someone who'll enjoy it:

Abstract:Leveraging second-order information at the scale of deep networks is one of the main lines of approach for improving the performance of current optimizers for deep learning. Yet, existing approaches for accurate full-matrix preconditioning, such as Full-Matrix Adagrad (GGT) or Matrix-Free Approximate Curvature (M-FAC) suffer from massive storage costs when applied even to medium-scale models, as they must store a sliding window of gradients, whose memory requirements are multiplicative in the model dimension. In this paper, we address this issue via an efficient and simple-to-implement error-feedback technique that can be applied to compress preconditioners by up to two orders of magnitude in practice, without loss of convergence. Specifically, our approach compresses the gradient information via sparsification or low-rank compression \emph{before} it is fed into the preconditioner, feeding the compression error back into future iterations. Extensive experiments on deep neural networks for vision show that this approach can compress full-matrix preconditioners by up to two orders of magnitude without impact on accuracy, effectively removing the memory overhead of full-matrix preconditioning for implementations of full-matrix Adagrad (GGT) and natural gradient (M-FAC). Our code is available at https://github.com/IST-DASLab/EFCP.

View paper on

Share this with someone who'll enjoy it:

Title:Error Feedback Can Accurately Compress Preconditioners

Paper and Code