Timothy Wong

Syllable based DNN-HMM Cantonese Speech to Text System

Feb 13, 2024

Timothy Wong, Claire Li, Sam Lam, Billy Chiu, Qin Lu, Minglei Li, Dan Xiong, Roy Shing Yu, Vincent T. Y. Ng

Abstract: This paper reports our work on building a Cantonese Speech-to-Text (STT) system with a syllable-based acoustic model. The work is part of an effort to build an STT system to aid dyslexic students who have cognitive deficiency in writing skills but no problem expressing their ideas through speech. For Cantonese speech recognition, the basic unit of the acoustic model can be either the conventional Initial-Final (IF) syllable or the Onset-Nucleus-Coda (ONC) syllable, in which finals are further split into nucleus and coda to reflect the intra-syllable variations in Cantonese. Using the Kaldi toolkit, we train a hybrid Deep Neural Network and Hidden Markov Model (DNN-HMM) with stochastic gradient descent on GPUs, with and without the I-vector based speaker adaptive training technique. In all cases, the input features to the DNN are those of the same Gaussian Mixture Model with speaker adaptive training (GMM-SAT). Experiments show that ONC-based syllable acoustic modeling with the I-vector based DNN-HMM achieves the best performance, with a word error rate (WER) of 9.66% and a real time factor (RTF) of 1.38812.
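The two metrics reported in the abstract follow their standard definitions: WER is the token-level edit distance between reference and hypothesis divided by the reference length, and RTF is decoding time divided by audio duration. The snippet below is a minimal sketch of those definitions only, not the paper's scoring pipeline. The function names and the example tokens are hypothetical, and the abstract does not say how the scored units were tokenised.

```python
def word_error_rate(reference, hypothesis):
    """Token-level Levenshtein distance divided by the reference length."""
    ref, hyp = list(reference), list(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(decode_seconds, audio_seconds):
    """Decoding time divided by audio duration; above 1 is slower than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


# Hypothetical tokens: one substitution and one deletion against four reference tokens.
wer = word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"])
rtf = real_time_factor(decode_seconds=3.0, audio_seconds=2.0)
```

Under these definitions, an RTF above 1, such as the reported 1.38812, means that decoding takes longer than the duration of the audio.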

* 7 pages, 3 figures, LREC 2016

Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis

Jul 10, 2018

Timothy Wong, Zhiyuan Luo

Abstract: A recurrent auto-encoder model summarises sequential data through an encoder structure into a fixed-length vector and then reconstructs the original sequence through a decoder structure. The summarised vector can be used to represent time series features. In this paper, we propose relaxing the dimensionality of the decoder output so that it performs partial reconstruction. The fixed-length vector therefore represents features in the selected dimensions only. In addition, we propose using a rolling fixed-window approach to generate training samples from unbounded time series data. The change of time series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques. The proposed method can be applied to sensor signal analysis in large-scale industrial processes, where clusters of the vector representations can reflect the operating states of the industrial system.
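The rolling fixed-window sampling and the partial-reconstruction target described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' code: the function names, the `step` parameter, and the toy three-sensor stream are all assumptions, and the model itself is omitted.

```python
from collections import deque


def rolling_windows(stream, window, step=1):
    """Yield fixed-length windows from a (possibly unbounded) iterable of samples."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive integers")
    buf = deque(maxlen=window)  # keeps only the most recent `window` samples
    for i, sample in enumerate(stream):
        buf.append(sample)
        # Emit once the buffer is full, then every `step` samples afterwards.
        if i >= window - 1 and (i - (window - 1)) % step == 0:
            yield list(buf)


def make_training_pair(window_rows, target_dims):
    """Encoder input uses every dimension; decoder target keeps only `target_dims`."""
    inputs = [list(row) for row in window_rows]
    targets = [[row[d] for d in target_dims] for row in window_rows]
    return inputs, targets


# Toy stream: five time steps from three hypothetical sensors.
stream = [[t, 10 * t, 100 * t] for t in range(5)]
windows = list(rolling_windows(stream, window=3, step=1))
inputs, targets = make_training_pair(windows[0], target_dims=[0, 2])
```

Because consecutive windows overlap in all but `step` samples, their encoded vectors would be expected to change gradually, which is consistent with the smooth trajectory the abstract mentions.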

* E. Pimenidis and C. Jayne (Eds.): EANN 2018, CCIS 893
* Accepted paper at the 19th International Conference on Engineering Applications of Neural Networks (EANN 2018)
