Title: Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

Oct 09, 2021

Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

Abstract: Many speech applications require understanding aspects beyond the words being spoken, such as recognizing emotion, detecting whether the speaker is wearing a mask, or distinguishing real from synthetic speech. In this work, we introduce a new state-of-the-art paralinguistic representation derived from large-scale, fully self-supervised training of a 600M+ parameter Conformer-based architecture. We benchmark on a diverse set of speech tasks and demonstrate that simple linear classifiers trained on top of our time-averaged representation outperform nearly all previous results, in some cases by large margins. Our analyses of context-window size demonstrate that, surprisingly, 2-second context windows achieve 98% of the performance of the Conformers that use the full long-term context. Furthermore, while the best per-task representations are extracted from internal layers of the network, stable performance across several layers allows a single universal representation to reach near-optimal performance on all tasks.
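The evaluation recipe the abstract describes (average frame-level embeddings over time, then train a simple linear classifier on the result) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the frame embeddings are synthetic stand-ins for Conformer layer outputs, the ridge-regularized least-squares classifier is one possible choice of linear classifier, and the names `time_average`, `fit_linear_classifier`, `predict`, and `fake_clip` are hypothetical.

```python
import numpy as np


def time_average(frames: np.ndarray) -> np.ndarray:
    """Collapse a (num_frames, dim) array of frame embeddings into one (dim,) vector."""
    return frames.mean(axis=0)


def fit_linear_classifier(X: np.ndarray, y: np.ndarray, num_classes: int,
                          l2: float = 1e-3) -> np.ndarray:
    """Fit a ridge-regularized least-squares classifier on one-hot targets.

    Assumption: this stands in for the "simple linear classifiers" in the
    abstract; the paper may use a different linear model.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(num_classes)[y]                     # one-hot encode the labels
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)            # weights, shape (dim + 1, num_classes)


def predict(W: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Return the highest-scoring class index for each row of X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ W).argmax(axis=1)


# Synthetic stand-in for model outputs: each clip is a variable-length
# sequence of noisy frame embeddings scattered around a per-class mean.
rng = np.random.default_rng(0)
dim, num_classes = 16, 2
class_means = rng.normal(size=(num_classes, dim))


def fake_clip(label: int) -> np.ndarray:
    """Generate hypothetical frame embeddings for one clip of the given class."""
    num_frames = int(rng.integers(20, 60))
    return class_means[label] + rng.normal(size=(num_frames, dim))


labels = np.array([i % num_classes for i in range(200)])
X = np.stack([time_average(fake_clip(int(label))) for label in labels])

# Train on the first 150 clips, evaluate on the remaining 50.
W = fit_linear_classifier(X[:150], labels[:150], num_classes)
accuracy = float((predict(W, X[150:]) == labels[150:]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Because averaging over time yields a fixed-size vector regardless of clip length, the same downstream classifier code works for clips of any duration, which is what makes a single "universal" representation convenient to benchmark across tasks.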
