Title: End-To-End Speech Recognition Using A High Rank LSTM-CTC Based Model

Mar 12, 2019

Yangyang Shi, Mei-Yuh Hwang, Xin Lei

Abstract: Long Short Term Memory Connectionist Temporal Classification (LSTM-CTC) based end-to-end models are widely used in speech recognition because they are simple to train and efficient to decode. In conventional LSTM-CTC based models, a bottleneck projection matrix maps the hidden feature vectors obtained from the LSTM to the softmax output layer. In this paper, we propose replacing that projection matrix with a high rank projection layer. The output of the high rank projection layer is a weighted combination of vectors, each projected from the hidden feature vectors through a different projection matrix and a non-linear activation function. The high rank projection layer improves the expressiveness of LSTM-CTC models. Experimental results show that on the Wall Street Journal (WSJ) corpus and the LibriSpeech data set, the proposed method achieves a 4%-6% relative word error rate (WER) reduction over the baseline CTC system. The proposed models outperform other published CTC based end-to-end (E2E) models under the condition that no external data or data augmentation is applied. Code has been made available at https://github.com/mobvoi/lstm_ctc.
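
The weighted combination described in the abstract can be illustrated with a minimal sketch for a single frame. This is not the authors' implementation (see the linked repository for that). The tanh activation, the softmax gate that produces the mixture weights, the shared output matrix, and every name and dimension below are assumptions chosen only to make the idea concrete.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for trained weights (illustrative only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def high_rank_projection(hidden, projections, gate, output):
    """Map one LSTM hidden vector to label posteriors via a mixture of projections."""
    # Mixture weights: one per projection matrix, summing to 1 (assumed softmax gate).
    weights = softmax(matvec(gate, hidden))
    # Each component uses its own projection matrix, then tanh (assumed activation).
    components = [[math.tanh(v) for v in matvec(p, hidden)] for p in projections]
    # Weighted combination of the projected vectors.
    dim = len(components[0])
    mixed = [sum(w * c[i] for w, c in zip(weights, components)) for i in range(dim)]
    # Shared output matrix followed by the softmax output layer (assumed layout).
    return softmax(matvec(output, mixed))


# Hypothetical sizes: hidden dim, projection dim, number of labels, number of projections.
HIDDEN, PROJ, LABELS, K = 8, 4, 5, 3
rng = random.Random(0)
projections = [random_matrix(PROJ, HIDDEN, rng) for _ in range(K)]
gate = random_matrix(K, HIDDEN, rng)
output = random_matrix(LABELS, PROJ, rng)
hidden = [rng.uniform(-1, 1) for _ in range(HIDDEN)]

posteriors = high_rank_projection(hidden, projections, gate, output)
```

With K set to 1 and the activation removed, the sketch reduces to the conventional single bottleneck projection that the abstract describes as the baseline.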

* ICASSP 2019
