Title: Streaming Punctuation for Long-form Dictation with Transformers

Oct 11, 2022

Piyush Behre, Sharman Tan, Padma Varadharajan, Shuangyu Chang

Abstract: While speech recognition Word Error Rate (WER) has reached human parity for English, long-form dictation scenarios still suffer from segmentation and punctuation problems resulting from irregular pausing patterns or slow speakers. Transformer sequence tagging models are effective at capturing long bi-directional context, which is crucial for automatic punctuation. A typical Automatic Speech Recognition (ASR) production system, however, is constrained by real-time requirements, making it hard to incorporate the right context when making punctuation decisions. In this paper, we propose a streaming approach for punctuation or re-punctuation of ASR output using dynamic decoding windows and measure its impact on punctuation and segmentation accuracy in a variety of scenarios. The new system tackles over-segmentation issues, improving segmentation F0.5 score by 13.9%. Streaming punctuation achieves an average BLEU score gain of 0.66 for the downstream task of Machine Translation (MT).
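
For readers unfamiliar with the F0.5 metric mentioned in the abstract, the following is a minimal sketch of the standard F-beta score, which with beta = 0.5 weights precision more heavily than recall. One plausible reading, not stated in the abstract itself, is that this suits over-segmentation, where spurious sentence boundaries show up as false positives. The function name `f_beta` and the precision/recall values are illustrative assumptions and are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    beta < 1 favors precision; beta = 1 gives the usual F1 score.
    """
    if precision < 0.0 or recall < 0.0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1.0 + b2) * precision * recall / denom


if __name__ == "__main__":
    # Illustrative values only (not results from the paper).
    p, r = 0.8, 0.5
    print("F0.5:", f_beta(p, r, beta=0.5))
    print("F1:  ", f_beta(p, r, beta=1.0))
```

With precision higher than recall, as in the illustrative values above, F0.5 comes out higher than F1 because the precision term dominates.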