Abstract: We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. Training this model does not require phonetically transcribed L2 speech; it is enough to mark which words are mispronounced. The absence of phonetic transcriptions means the model must learn from the weak signal of word-level mispronunciation labels alone. Because of this, and because mispronounced L2 speech is available only in limited amounts, the model is prone to overfitting. To limit this risk, we train it in a multi-task setup. In the first task, we estimate the probabilities of word-level mispronunciation. For the second task, we use a phoneme recognizer trained on phonetically transcribed L1 speech, which is easily accessible and can be annotated automatically. Compared to state-of-the-art approaches, we improve the accuracy of detecting word-level pronunciation errors, measured by AUC, by 30% on the GUT Isle Corpus of L2 Polish speakers, and by 21.5% on the Isle Corpus of L2 German and Italian speakers.
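To make the multi-task setup concrete, below is a minimal sketch of how such a model could be wired up. It is not the paper's exact architecture: the GRU encoder, mean-pooling over word boundaries, the CTC-style phoneme head, and all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskPronunciationModel(nn.Module):
    """Sketch of a multi-task model: a shared acoustic encoder feeds
    (1) a word-level mispronunciation head trained on weak labels and
    (2) a phoneme-recognition head trained on transcribed L1 speech."""

    def __init__(self, n_mels=80, hidden=256, n_phonemes=40):
        super().__init__()
        # Shared encoder over log-mel features (dimensions are illustrative).
        self.encoder = nn.GRU(n_mels, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        # Task 1: probability that a word is mispronounced; frames inside
        # each word's boundaries are mean-pooled before classification.
        self.word_head = nn.Linear(2 * hidden, 1)
        # Task 2: frame-level phoneme logits (trained with CTC on L1 data;
        # one extra output is reserved for the CTC blank symbol).
        self.phone_head = nn.Linear(2 * hidden, n_phonemes + 1)

    def forward(self, mels, word_spans):
        # mels: (batch, frames, n_mels); word_spans: per utterance, a list
        # of (start, end) frame indices, one pair per word.
        frames, _ = self.encoder(mels)                # (B, T, 2H)
        phone_logits = self.phone_head(frames)        # (B, T, P+1)
        word_probs = []
        for b, spans in enumerate(word_spans):
            pooled = torch.stack([frames[b, s:e].mean(0) for s, e in spans])
            word_probs.append(torch.sigmoid(self.word_head(pooled)).squeeze(-1))
        return word_probs, phone_logits

# Smoke test with random features: one utterance containing two words.
model = MultiTaskPronunciationModel()
mels = torch.randn(1, 120, 80)
word_probs, phone_logits = model(mels, [[(0, 50), (50, 120)]])
```

During training, a binary cross-entropy loss on the word probabilities would be combined with a CTC loss on the phoneme logits, so that the phoneme task, trained on plentiful L1 data, regularizes the weakly-supervised word task.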
Abstract: A common approach to the automatic detection of mispronunciations in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, and b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result in a significant number of false mispronunciation alarms. We propose a novel approach that overcomes this problem based on two principles: a) taking into account uncertainty in the automatic phoneme recognition step, and b) accounting for the fact that there may be multiple valid pronunciations. We evaluate the model on non-native (L2) English speech of German, Italian, and Polish speakers, where it increases the precision of detecting mispronunciations by up to 18% (relative) compared to the common approach.
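As a toy illustration of the two principles, the sketch below scores a word using phoneme posterior probabilities (uncertainty) against a set of valid pronunciation variants (multiple correct forms). It assumes segment-aligned posteriors, same-length variants, and conditional independence across segments; these are simplifications for illustration, not claims about the paper's model.

```python
import numpy as np

def mispronunciation_score(posteriors, variants, phone_to_idx):
    """Uncertainty-aware scoring sketch.

    posteriors: (n_segments, n_phonemes) array of phoneme posterior
        probabilities, one row per aligned segment of the word.
    variants: list of valid pronunciations, each a list of phoneme
        symbols of length n_segments (same-length variants keep the
        alignment trivial in this toy example).
    Returns 1 minus the best probability, over all variants, that
    every segment matches that variant's phoneme.
    """
    best = 0.0
    for variant in variants:
        idx = [phone_to_idx[p] for p in variant]
        # Probability the whole word matches this variant, assuming
        # conditional independence across segments.
        p_match = float(np.prod(posteriors[np.arange(len(idx)), idx]))
        best = max(best, p_match)
    return 1.0 - best

# Toy usage: a word like "either" with two valid first-vowel variants.
phone_to_idx = {"iy": 0, "ay": 1, "dh": 2, "er": 3}
post = np.array([[0.6, 0.3, 0.05, 0.05],   # segment 1: mostly 'iy'
                 [0.05, 0.05, 0.8, 0.1],   # segment 2: 'dh'
                 [0.1, 0.05, 0.05, 0.8]])  # segment 3: 'er'
variants = [["iy", "dh", "er"], ["ay", "dh", "er"]]
print(mispronunciation_score(post, variants, phone_to_idx))
```

A word would be flagged as mispronounced when this score exceeds a tuned threshold. By contrast, a hard-decision recognizer compared against a single canonical pronunciation raises a false alarm whenever the top phoneme deviates from that one form, even when a valid alternative is nearly as probable.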