Title: LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks

Jun 13, 2024

Amit Meghanani, Thomas Hain

Abstract: Self-supervised learning (SSL)-based speech models are extensively used for full-stack speech processing. However, it has been observed that improving SSL-based speech representations using unlabeled speech for content-related tasks is challenging and computationally expensive. Recent attempts have been made to address this issue with cost-effective self-supervised fine-tuning (SSFT) approaches. Continuing in this direction, a cost-effective SSFT method named "LASER: Learning by Aligning Self-supervised Representations" is presented. LASER is based on the soft-DTW alignment loss with a temporal regularisation term. Experiments are conducted with HuBERT and WavLM models and evaluated on the SUPERB benchmark for two content-related tasks: automatic speech recognition (ASR) and phoneme recognition (PR). Relative improvements of 3.7% and 8.2% for HuBERT, and 4.1% and 11.7% for WavLM, are observed for the ASR and PR tasks respectively, with less than 3 hours of fine-tuning on a single GPU.

* Accepted at Interspeech 2024
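
The abstract names soft-DTW as the alignment loss underlying LASER. For readers unfamiliar with it, the following is a minimal NumPy sketch of the generic soft-DTW value between two feature sequences, not the authors' implementation. The squared Euclidean frame cost, the smoothing parameter `gamma`, and the function names `soft_min` and `soft_dtw` are assumptions made for illustration. The temporal regularisation term is not specified in this listing, so it is omitted here.

```python
import numpy as np


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    v = -np.asarray(values, dtype=float) / gamma
    m = v.max()
    return -gamma * (m + np.log(np.exp(v - m).sum()))


def soft_dtw(x, y, gamma=0.1):
    """Soft-DTW value between x of shape (n, d) and y of shape (m, d).

    Assumption: squared Euclidean distance is used as the frame-level cost.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Pairwise frame costs, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # Accumulated cost table with an infinite border, as in classic DTW.
    r = np.full((n + 1, m + 1), np.inf)
    r[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Classic DTW takes a hard min here; soft-DTW uses a smoothed min,
            # which makes the value differentiable in the inputs.
            r[i, j] = cost[i - 1, j - 1] + soft_min(
                [r[i - 1, j], r[i, j - 1], r[i - 1, j - 1]], gamma
            )
    return r[n, m]


# Toy 1-D "feature" sequences: y is a time-stretched copy of x, z is unrelated.
x = np.array([[0.0], [1.0], [2.0]])
y = np.array([[0.0], [0.0], [1.0], [2.0]])
z = np.array([[5.0], [5.0], [5.0]])
stretched = soft_dtw(x, y, gamma=0.01)
unrelated = soft_dtw(x, z, gamma=0.01)
```

In this toy case the time-stretched pair should score far lower than the unrelated pair, which is the property an alignment loss relies on. As `gamma` shrinks, the value approaches the classic DTW cost.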
