Abstract: A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually annotated data were used to train a model that takes as input an arbitrary-length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants, and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized pair of vowel onset and offset times are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest that it outperformed the current gold standard for duration measurement, an HMM-based forced aligner (which requires orthographic or phonetic transcription as an input).
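To make the structured prediction setup concrete, below is a minimal sketch of the inference and training loop it implies. The feature functions in `phi`, the exhaustive onset/offset search, and the cost-augmented perceptron-style update are all illustrative assumptions; in particular, the update shown is a standard structured surrogate standing in for the expectation-matching objective described in the abstract, not the paper's exact algorithm.

```python
import numpy as np

def phi(x, onset, offset):
    """Hypothetical acoustic feature functions (illustrative only):
    mean signal energy inside the hypothesized vowel, mean energy
    outside it, and the normalized hypothesized duration."""
    inside = x[onset:offset].mean()
    outside = np.concatenate([x[:onset], x[offset:]]).mean()
    return np.array([inside, outside, (offset - onset) / len(x)])

def candidates(n, min_dur=5):
    """All (onset, offset) pairs respecting a minimal vowel duration."""
    return [(t0, t1) for t0 in range(n - min_dur)
            for t1 in range(t0 + min_dur, n)]

def predict(w, x):
    """Inference: return the (onset, offset) pair maximizing the
    linear score w . phi(x, onset, offset)."""
    return max(candidates(len(x)), key=lambda c: w @ phi(x, *c))

def duration_cost(hyp, gold):
    """Task loss: absolute difference between hypothesized and
    manually measured durations."""
    return abs((hyp[1] - hyp[0]) - (gold[1] - gold[0]))

def train_step(w, x, gold, lr=0.01):
    """One cost-augmented structured update: find the hypothesis whose
    score plus duration cost is largest, then move w toward the gold
    features and away from that hypothesis's features."""
    worst = max(candidates(len(x)),
                key=lambda c: w @ phi(x, *c) + duration_cost(c, gold))
    return w + lr * (phi(x, *gold) - phi(x, *worst))
```

Starting from `w = np.zeros(3)`, repeated calls to `train_step` over the annotated data yield a weight vector usable by `predict`. The quadratic search over all candidate pairs is kept only for clarity; at scale the candidate set would need to be restricted or searched more efficiently.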
Abstract: We describe and analyze a simple and effective algorithm for sequence segmentation applied to speech processing tasks. We propose a neural architecture composed of two modules trained jointly: a recurrent neural network (RNN) module and a structured prediction model. The RNN outputs serve as feature functions for the structured model. The overall model is trained with a structured loss function that can be designed for the given segmentation task. We demonstrate the effectiveness of our method by applying it to two simple tasks commonly used in phonetic studies: word segmentation and voice onset time segmentation. Results suggest the proposed model is superior to previous methods, obtaining state-of-the-art results on the tested datasets.
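A minimal sketch of how the two modules might compose follows, assuming PyTorch, a bidirectional LSTM over per-frame acoustic features, and a segment score that sums two learned boundary scores; none of these specifics are taken from the paper. The structured hinge loss with a boundary-distance cost is likewise an illustrative stand-in for whatever task-specific structured loss a given segmentation task calls for.

```python
import torch
import torch.nn as nn

class RNNSegmenter(nn.Module):
    """Joint RNN + structured model: the RNN turns acoustic frames into
    per-frame scores that act as learned feature functions, and the
    structured layer searches for the best-scoring segment boundaries."""

    def __init__(self, n_feats, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(n_feats, hidden, batch_first=True,
                           bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)  # one boundary score per frame

    def frame_scores(self, x):
        # x: (1, T, n_feats) -> (T,) scores, one per frame.
        h, _ = self.rnn(x)
        return self.score(h).squeeze(-1).squeeze(0)

def best_segment(scores, gold=None, min_dur=5):
    """Exhaustive structured search over (onset, offset) pairs. When a
    gold segment is given, the search is cost-augmented with a
    boundary-distance task loss, as needed for the hinge loss below."""
    T = scores.shape[0]
    best, best_val = None, None
    for t0 in range(T - min_dur):
        for t1 in range(t0 + min_dur, T):
            v = scores[t0] + scores[t1]
            if gold is not None:
                v = v + abs(t0 - gold[0]) + abs(t1 - gold[1])
            if best_val is None or v > best_val:
                best, best_val = (t0, t1), v
    return best, best_val

def structured_hinge(model, x, gold):
    """Structured hinge loss: the cost-augmented best segment should not
    outscore the gold segment; gradients flow back into the RNN, so both
    modules are trained jointly."""
    s = model.frame_scores(x)
    _, worst_val = best_segment(s, gold=gold)
    gold_val = s[gold[0]] + s[gold[1]]
    return torch.clamp(worst_val - gold_val, min=0.0)
```

In use, something like `RNNSegmenter(n_feats=13)` (e.g., 13 MFCCs per frame) would be trained by stepping an optimizer on `structured_hinge(model, frames, gold)` per utterance; the O(T²) boundary search is again only for clarity and would be replaced by dynamic programming or a pruned candidate set in practice.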