Title: GACT: Activation Compressed Training for General Architectures

Jun 28, 2022

Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez (+2 more)

Abstract: Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reducing the training memory footprint. This paper presents GACT, an ACT framework that supports a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge of operator type or model architecture. To make training stable, we propose an algorithm that decides the compression ratio for each tensor by estimating its impact on the gradient at run time. We implement GACT as a PyTorch library that readily applies to any NN architecture. GACT reduces the activation memory for convolutional NNs, transformers, and graph NNs by up to 8.1x, enabling training with a 4.2x to 24.7x larger batch size, with negligible accuracy loss.
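
As a rough illustration of the idea behind activation compression, the sketch below stores a saved activation at a low bit width using stochastic rounding and restores an approximation for the backward pass. It is not GACT's API or its per-tensor bit-allocation algorithm; the function names `quantize` and `dequantize`, the min/max scaling, and the fixed 2-bit setting are all assumptions made for this example.

```python
import random

def quantize(values, bits, rng):
    """Map floats to integer codes in [0, 2**bits - 1] with stochastic rounding.

    Hypothetical helper for illustration only; not part of any GACT release.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Width of one quantization step; fall back to 1.0 for a constant tensor.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = []
    for v in values:
        x = (v - lo) / scale  # position on the grid, in [0, levels]
        floor = int(x)
        # Round up with probability equal to the fractional part,
        # so the expected code equals x (an unbiased estimate).
        code = floor + (1 if rng.random() < x - floor else 0)
        codes.append(min(code, levels))
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

rng = random.Random(0)
activations = [0.0, 0.13, 0.5, 0.77, 1.0]

# Compress once, as a forward pass would before saving the tensor.
codes, lo, scale = quantize(activations, bits=2, rng=rng)
restored = dequantize(codes, lo, scale)

# Average many independent round trips to see that the error has zero mean.
trials = 20000
sums = [0.0] * len(activations)
for _ in range(trials):
    c, l, s = quantize(activations, bits=2, rng=rng)
    for i, r in enumerate(dequantize(c, l, s)):
        sums[i] += r
means = [total / trials for total in sums]
```

Each restored value lies within one quantization step of the original, and the averages in `means` approach the original activations. That unbiasedness is the property convergence analyses of ACT-style methods typically rely on; choosing a different bit width per tensor, as the abstract describes, would replace the fixed `bits=2` used here.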
