
Akio Kobayashi

ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi

Apr 03, 2021

Yu Wang, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki


Abstract: The availability of open-source software plays a remarkable role in automatic speech recognition (ASR). Kaldi, for instance, is widely used to develop state-of-the-art offline and online ASR systems. This paper describes ExKaldi-RT, an online ASR toolkit built on Kaldi and implemented in Python. ExKaldi-RT provides tools for building a real-time audio stream pipeline, extracting acoustic features, transmitting packets over a remote connection, estimating acoustic probabilities with a neural network, and online decoding. While similar functions are already available in Kaldi-based tools, a key feature of ExKaldi-RT is that it works entirely in Python, giving online ASR system developers an easy-to-use interface for applying their own research, for example neural network-based signal processing or acoustic models trained with deep learning frameworks. We performed benchmark experiments on the minimum LibriSpeech corpus and showed that ExKaldi-RT could achieve competitive ASR performance in real time.

* Submitted to INTERSPEECH 2021
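
The first two stages named in the abstract (a real-time audio stream and acoustic feature extraction) can be illustrated with a minimal sketch. This is not the ExKaldi-RT API: every function name below is hypothetical, the log-energy feature is a stand-in for real acoustic features, and the 400-sample frame with a 160-sample shift assumes 16 kHz audio with 25 ms frames and a 10 ms shift. The point is only to show how a stream of arbitrary-sized chunks can be cut into overlapping frames with a carry-over buffer.

```python
import math
from typing import Iterable, Iterator, List


def stream_chunks(samples: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Simulate a live audio source by yielding fixed-size chunks (hypothetical helper)."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def frame_stream(chunks: Iterable[List[float]], frame_len: int,
                 frame_shift: int) -> Iterator[List[float]]:
    """Cut an incoming chunk stream into overlapping frames.

    Samples that do not yet fill a frame stay in the buffer until
    the next chunk arrives, so the result does not depend on chunk size.
    """
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= frame_len:
            yield buffer[:frame_len]
            del buffer[:frame_shift]  # advance by one frame shift


def log_energy(frame: List[float]) -> float:
    """Stand-in feature: log of the frame energy, floored to avoid log(0)."""
    energy = sum(x * x for x in frame)
    return math.log(max(energy, 1e-10))


def run_pipeline(samples: List[float], chunk_size: int = 160,
                 frame_len: int = 400, frame_shift: int = 160) -> List[float]:
    """Chain the stages: chunked stream -> framing -> one feature per frame."""
    chunks = stream_chunks(samples, chunk_size)
    return [log_energy(f) for f in frame_stream(chunks, frame_len, frame_shift)]
```

In a full system, the per-frame features would be passed on to an acoustic model and a decoder; those stages are omitted here because the abstract does not describe their interfaces.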
