Bogdan Ludusan

ProsAudit, a prosodic benchmark for self-supervised speech models

Feb 24, 2023

Maureen de Seyssel, Marvin Lavechin, Hadrien Titeux, Arthur Thomas, Gwendal Virlet, Andrea Santos Revilla, Guillaume Wisniewski, Bogdan Ludusan, Emmanuel Dupoux

Abstract: We present ProsAudit, a benchmark in English to assess structural prosodic knowledge in self-supervised learning (SSL) speech models. It consists of two subtasks, their corresponding metrics, and an evaluation dataset. In the protosyntax task, the model must correctly identify strong versus weak prosodic boundaries. In the lexical task, the model needs to correctly distinguish between pauses inserted between words and within words. We also provide human evaluation scores on this benchmark. We evaluated a series of SSL models and found that they were all able to perform above chance on both tasks, even when trained on an unseen language. However, non-native models performed significantly worse than native ones on the lexical task, highlighting the importance of lexical knowledge in this task. We also found a clear effect of size, with models trained on more data performing better in the two subtasks.

* 4 pages + references, 1 figure

Are words easier to learn from infant- than adult-directed speech? A quantitative corpus-based investigation

Dec 23, 2017

Adriana Guevara-Rukoz, Alejandrina Cristia, Bogdan Ludusan, Roland Thiollière, Andrew Martin, Reiko Mazuka, Emmanuel Dupoux

Abstract: We investigate whether infant-directed speech (IDS) could facilitate word form learning when compared to adult-directed speech (ADS). To study this, we examine the distribution of word forms at two levels, acoustic and phonological, using a large database of spontaneous speech in Japanese. At the acoustic level we show that, as has been documented before for phonemes, the realizations of words are more variable and less discriminable in IDS than in ADS. At the phonological level, we find an effect in the opposite direction: the IDS lexicon contains more distinctive words (such as onomatopoeias) than the ADS counterpart. Combining the acoustic and phonological metrics into a global discriminability score reveals that the greater separation of lexical categories in the phonological space does not compensate for the opposite effect observed at the acoustic level. As a result, IDS word forms are still globally less discriminable than ADS word forms, even though the effect is numerically small. We discuss the implications of these findings for the view that the functional role of IDS is to improve language learnability.

* Draft
