Joost van Doremalen

Calibration of Phone Likelihoods in Automatic Speech Recognition

Jun 14, 2016

David A. van Leeuwen, Joost van Doremalen

Figures 1 to 4 for Calibration of Phone Likelihoods in Automatic Speech Recognition

Abstract: In this paper we study the probabilistic properties of the posteriors in a speech recognition system that uses a deep neural network (DNN) for acoustic modeling. We do this by reducing Kaldi's DNN shared pdf-id posteriors to phone likelihoods, and evaluating these with a calibration-sensitive metric on test-set forced alignments. Individual frame posteriors are in principle well calibrated, because the DNN is trained using cross entropy as the objective function, which is a proper scoring rule. When entire phones are assessed, we observe that it is best to average the log likelihoods over the duration of the phone. Further scaling of the average log likelihoods by the logarithm of the duration slightly improves the calibration, and this improvement is retained when tested on independent test data.
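
The two phone-level scores mentioned in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the per-frame log likelihoods of one phone segment are assumed to be already available from a forced alignment, and the abstract does not state exactly how the scaling by log duration is applied, so the multiplicative form used here is an assumption.

```python
import math
from typing import Sequence


def mean_log_likelihood(frame_loglikes: Sequence[float]) -> float:
    """Average the frame log likelihoods over the duration of one phone.

    `frame_loglikes` holds one log likelihood per frame of the segment
    (hypothetical input; how it is derived from pdf-id posteriors is
    not shown here).
    """
    if len(frame_loglikes) == 0:
        raise ValueError("a phone segment needs at least one frame")
    return sum(frame_loglikes) / len(frame_loglikes)


def duration_scaled_score(frame_loglikes: Sequence[float]) -> float:
    """Scale the average log likelihood by the log of the duration.

    ASSUMPTION: the scaling is taken to be a plain multiplication by
    log(number of frames). Under this reading a one-frame segment
    scores 0, because log(1) == 0.
    """
    n_frames = len(frame_loglikes)
    return math.log(n_frames) * mean_log_likelihood(frame_loglikes)


if __name__ == "__main__":
    segment = [-1.0, -2.0, -3.0]  # made-up values for a 3-frame phone
    print(mean_log_likelihood(segment))
    print(duration_scaled_score(segment))
```

Averaging rather than summing keeps the score of a long phone on the same scale as that of a short one, which is one plausible reason the averaged form is easier to calibrate.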

* Rejected by Interspeech 2016. I would love to include the reviews, but there is no space for that here (400 characters)
