Title: Do self-supervised speech models develop human-like perception biases?

May 31, 2022

Juliette Millet, Ewan Dunbar

Abstract: Self-supervised models for speech processing form representational spaces without using any external labels. Increasingly, they appear to be a feasible way of at least partially eliminating costly manual annotations, a problem of particular concern for low-resource languages. But what kind of representational spaces do these models construct? Human perception specializes to the sounds of listeners' native languages. Does the same thing happen in self-supervised models? We examine the representational spaces of three kinds of state-of-the-art self-supervised models: wav2vec 2.0, HuBERT and contrastive predictive coding (CPC), and compare them with the perceptual spaces of French-speaking and English-speaking human listeners, both globally and taking account of the behavioural differences between the two language groups. We find that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language specific. A comparison against the predictions of supervised phone recognisers suggests that all three self-supervised models capture relatively fine-grained perceptual phenomena, while supervised models are better at capturing coarser, phone-level effects of listeners' native language on perception.

* 2022. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7591-7605, Dublin, Ireland. Association for Computational Linguistics
