Title: Application of Knowledge Distillation to Multi-task Speech Representation Learning

Oct 29, 2022

Mine Kerpicci, Van Nguyen, Shuhua Zhang, Erik Visser

Abstract: Model architectures such as wav2vec 2.0 and HuBERT have been proposed to learn speech representations from audio waveforms in a self-supervised manner. When combined with downstream tasks such as speech recognition, these models have been shown to provide state-of-the-art performance. However, they are large: even the smallest version has about 95 million parameters, which poses a challenge for deployment on edge AI devices. In this paper, we use knowledge distillation to reduce the original model size by about 75% while maintaining similar performance levels. Moreover, we distill both wav2vec 2.0 and HuBERT models and present a comprehensive performance analysis in which we fine-tune the distilled models in single-task and multi-task frameworks separately. In particular, our experiments show that fine-tuning the distilled models on keyword spotting and speaker verification tasks results in only 0.1% accuracy and 0.9% equal error rate degradations, respectively.

* Speech representation learning, multitask learning, wav2vec, HuBERT, knowledge distillation
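
To make the abstract's numbers concrete, the following is a minimal sketch, not the authors' implementation. It works out the approximate student size implied by a 75% reduction of a 95-million-parameter model, and shows one common shape for a feature-matching distillation objective (per-frame L1 distance plus one minus cosine similarity). The abstract does not state the loss the paper uses, so the loss form and all function names here (`compressed_param_count`, `cosine_similarity`, `distillation_loss`) are assumptions made for illustration.

```python
import math


def compressed_param_count(teacher_params: float, reduction: float) -> float:
    """Parameters left after shrinking a model by the given fraction."""
    return teacher_params * (1.0 - reduction)


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def distillation_loss(teacher_frames, student_frames):
    """Hypothetical feature-matching loss, averaged over frames.

    Each frame contributes its mean absolute difference (L1) plus
    (1 - cosine similarity) between teacher and student features.
    This is an assumed, illustrative objective, not the paper's.
    """
    if not teacher_frames or len(teacher_frames) != len(student_frames):
        raise ValueError("teacher and student need the same non-zero number of frames")
    total = 0.0
    for t, s in zip(teacher_frames, student_frames):
        l1 = sum(abs(a - b) for a, b in zip(t, s)) / len(t)
        total += l1 + (1.0 - cosine_similarity(t, s))
    return total / len(teacher_frames)


if __name__ == "__main__":
    # About 95M parameters reduced by about 75% leaves roughly 24M.
    print(compressed_param_count(95e6, 0.75))

    # Toy 2-frame, 2-dimensional features standing in for model outputs.
    teacher = [[1.0, 0.0], [3.0, 4.0]]
    student = [[0.5, 0.5], [3.0, 4.0]]
    print(distillation_loss(teacher, student))
```

In a real setup the frame features would come from the teacher (wav2vec 2.0 or HuBERT) and the smaller student network, and the loss would be minimized with respect to the student's weights before fine-tuning on the downstream tasks.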
