Title: TQCompressor: improving tensor decomposition methods in neural networks via permutations

Jan 29, 2024

V. Abronin, A. Naumov, D. Mazur, D. Bystrov, K. Tsarova, Ar. Melnikov, I. Oseledets, S. Dolgov, R. Brasher, M. Perelshtein

Figures 1-4 for TQCompressor: improving tensor decomposition methods in neural networks via permutations

Abstract: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. We explore the challenges posed by the computational and storage demands of pre-trained language models in NLP tasks and propose a permutation-based enhancement to Kronecker decomposition. This enhancement makes it possible to reduce the loss in model expressivity that is usually associated with factorization. We demonstrate the method on GPT-2$_{small}$. The result of the compression is the TQCompressedGPT-2 model, with 81 million parameters compared to 124 million in GPT-2$_{small}$. We make TQCompressedGPT-2 publicly available. We further improve the performance of TQCompressedGPT-2 through a training strategy involving multi-step knowledge distillation, using only 3.1% of the OpenWebText dataset. TQCompressedGPT-2 surpasses DistilGPT-2 and KnGPT-2 in comparative evaluations, marking an advancement in the efficient and effective deployment of models in resource-constrained environments.
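To illustrate why a permutation can matter for a Kronecker decomposition, here is a minimal NumPy sketch. It is not the authors' algorithm: it uses the standard rank-1 SVD fit of a rearranged matrix (the Van Loan-Pitsianis construction) to find the nearest Kronecker product, and it assumes the helpful row permutation is already known because the toy matrix is built by shuffling an exact Kronecker product. The function names `rearrange`, `nearest_kronecker` and `rel_error`, the shapes, and the seed are all hypothetical choices made for this example; how TQCompressor actually selects its permutations is described in the paper, not here.

```python
import numpy as np


def rearrange(W, m1, n1, m2, n2):
    """Map W (m1*m2 x n1*n2) to R (m1*n1 x m2*n2).

    Each row of R is one flattened (m2 x n2) block of W, so that
    W == kron(A, B) exactly when R == outer(A.ravel(), B.ravel()).
    """
    return W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)


def nearest_kronecker(W, m1, n1, m2, n2):
    """Best Frobenius-norm fit W ~ kron(A, B) via a rank-1 SVD of rearrange(W)."""
    R = rearrange(W, m1, n1, m2, n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    scale = np.sqrt(s[0])
    A = scale * U[:, 0].reshape(m1, n1)
    B = scale * Vt[0].reshape(m2, n2)
    return A, B


def rel_error(W, A, B):
    """Relative Frobenius error of the Kronecker approximation."""
    return np.linalg.norm(W - np.kron(A, B)) / np.linalg.norm(W)


# Toy setup (assumed shapes): an exact Kronecker product with shuffled rows.
rng = np.random.default_rng(0)
m1, n1, m2, n2 = 4, 5, 3, 2
A_true = rng.standard_normal((m1, n1))
B_true = rng.standard_normal((m2, n2))
W_structured = np.kron(A_true, B_true)   # shape (12, 10)

perm = rng.permutation(m1 * m2)          # the row shuffle
W_shuffled = W_structured[perm]          # Kronecker structure is hidden

# Fit 1: factorize the shuffled matrix directly.
A1, B1 = nearest_kronecker(W_shuffled, m1, n1, m2, n2)
err_shuffled = rel_error(W_shuffled, A1, B1)

# Fit 2: undo the shuffle first (argsort gives the inverse permutation),
# then factorize. Only the index vector has to be stored alongside A and B.
inverse_perm = np.argsort(perm)
W_restored = W_shuffled[inverse_perm]
A2, B2 = nearest_kronecker(W_restored, m1, n1, m2, n2)
err_restored = rel_error(W_restored, A2, B2)

# Parameter counts: dense matrix versus the two Kronecker factors.
dense_params = W_shuffled.size
kron_params = A2.size + B2.size

print(f"relative error without permutation: {err_shuffled:.3e}")
print(f"relative error with permutation:    {err_restored:.3e}")
print(f"parameters: dense={dense_params}, kronecker={kron_params}")
```

In this constructed case the fit after un-shuffling is exact up to floating-point error, while the direct fit is not. Real weight matrices are not exact Kronecker products, so in practice a permutation would be expected to lower the approximation error rather than remove it, and finding a good permutation is the hard part that this sketch leaves out.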
