Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jennifer Cole

Crowdsourced and Automatic Speech Prominence Estimation

Oct 12, 2023

Max Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole, Bryan Pardo

Abstract:The prominence of a spoken word is the degree to which an average native listener perceives the word as salient or emphasized relative to its context. Speech prominence estimation is the process of assigning a numeric value to the prominence of each word in an utterance. These prominence labels are useful for linguistic analysis, as well as training automated systems to perform emphasis-controlled text-to-speech or emotion recognition. Manually annotating prominence is time-consuming and expensive, which motivates the development of automated methods for speech prominence estimation. However, developing such an automated system using machine-learning methods requires human-annotated training data. Using our system for acquiring such human annotations, we collect and open-source crowdsourced annotations of a portion of the LibriTTS dataset. We use these annotations as ground truth to train a neural speech prominence estimator that generalizes to unseen speakers, datasets, and speaking styles. We investigate design decisions for neural prominence estimation as well as how neural prominence estimation improves as a function of two key factors of annotation cost: dataset size and the number of annotations per utterance.

* Submitted to ICASSP 2024

Via

Access Paper or Ask Questions

Prosodic entrainment in dialog acts

Oct 30, 2018

Uwe D. Reichel, Katalin Mády, Jennifer Cole

Figure 1 for Prosodic entrainment in dialog acts

Figure 2 for Prosodic entrainment in dialog acts

Figure 3 for Prosodic entrainment in dialog acts

Figure 4 for Prosodic entrainment in dialog acts

Abstract:We examined prosodic entrainment in spoken dialogs separately for several dialog acts in cooperative and competitive games. Entrainment was measured for intonation features derived from a superpositional intonation stylization as well as for rhythm features. The found differences can be related to the cooperative or competitive nature of the game, as well as to dialog act properties as its intrinsic authority, supportiveness and distributional characteristics. In cooperative games dialog acts with a high authority given by knowledge and with a high frequency showed the most entrainment. The results are discussed amongst others with respect to the degree of active entrainment control in cooperative behavior.

* This manuscript is under revision. Please contact the authors for information about updates

Via

Access Paper or Ask Questions