Dilek Hakkani Tür

Correcting Automated and Manual Speech Transcription Errors using Warped Language Models

Mar 26, 2021

Mahdi Namazifar, John Malik, Li Erran Li, Gokhan Tur, Dilek Hakkani Tür

Abstract: Masked language models have revolutionized natural language processing systems in the past few years. Warped language models, a recently introduced generalization of masked language models, are trained to be more robust to the types of errors that appear in automatic or manual transcriptions of spoken language by being exposed to the same types of errors during training. In this work we propose a novel approach that takes advantage of the robustness of warped language models to transcription noise to correct transcriptions of spoken language. We show that the proposed approach achieves up to a 10% reduction in the word error rate of both automatic and manual transcriptions of spoken language.

* Submitted to INTERSPEECH
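
The abstract above reports its result as a reduction in word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis transcript, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn ref[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.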

Warped Language Models for Noise Robust Language Understanding

Nov 03, 2020

Mahdi Namazifar, Gokhan Tur, Dilek Hakkani Tür

Abstract: Masked Language Models (MLMs) are self-supervised neural networks trained to fill in the blanks in a given sentence with masked tokens. Despite the tremendous success of MLMs on various text-based tasks, they are not robust for spoken language understanding, especially in the presence of recognition noise from spontaneous conversational speech. In this work we introduce Warped Language Models (WLMs), in which input sentences at training time go through the same modifications as in MLMs, plus two additional modifications: inserting and dropping random tokens. These two modifications extend and contract the sentence, hence the word "warped" in the name. The insertion and drop modifications of the input text during WLM training resemble the types of noise caused by Automatic Speech Recognition (ASR) errors, and as a result WLMs are likely to be more robust to ASR noise. Through computational results we show that natural language understanding systems built on top of WLMs perform better than those built on MLMs, especially in the presence of ASR errors.

* To appear at IEEE SLT 2021
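
The input warping described in the abstract above can be sketched roughly as follows. The sketch assumes the BERT-style MLM modifications (mask, replace with a random token, keep unchanged) and adds the two WLM operations, insert and drop. The function name `warp`, the 15% selection rate, and the uniform choice among the five operations are illustrative assumptions; the abstract does not give the actual probabilities, and the construction of training targets is omitted.

```python
import random

MASK = "[MASK]"  # assumption: BERT-style mask symbol


def warp(tokens, vocab, rate=0.15, rng=None):
    """Return a warped copy of `tokens` (input side only, no labels).

    Each token is selected with probability `rate`; a selected token
    receives one of five operations chosen uniformly (an assumption).
    """
    if rng is None:
        rng = random.Random()
    operations = ["mask", "random", "keep", "insert", "drop"]
    warped = []
    for token in tokens:
        if rng.random() >= rate:
            warped.append(token)  # token not selected: copy as-is
            continue
        op = rng.choice(operations)
        if op == "mask":
            warped.append(MASK)               # MLM: hide the token
        elif op == "random":
            warped.append(rng.choice(vocab))  # MLM: substitute a random token
        elif op == "keep":
            warped.append(token)              # MLM: selected but unchanged
        elif op == "insert":
            warped.append(rng.choice(vocab))  # WLM: extend the sentence
            warped.append(token)
        # op == "drop": WLM contracts the sentence by appending nothing
    return warped


if __name__ == "__main__":
    sentence = "turn on the kitchen lights".split()
    toy_vocab = ["the", "a", "on", "off", "lights", "music"]
    print(warp(sentence, toy_vocab, rate=0.5, rng=random.Random(0)))
```

Unlike plain masking, the insert and drop operations change the sequence length, which is why a warped sentence can be shorter or longer than the original.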
