Prashant Kukde

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

Jun 01, 2023

Shashi Kant Gupta, Sushant Hiray, Prashant Kukde

Abstract: This work improves a Spoken Language Identification (LangId) system for a challenge on developing robust language identification systems that are reliable for non-standard, accented (Singaporean accent), spontaneous code-switched, and child-directed speech collected via Zoom. We propose a two-stage Encoder-Decoder-based E2E model. The encoder module consists of 1D depth-wise separable convolutions with Squeeze-and-Excitation (SE) layers with a global context. The decoder module uses an attentive temporal pooling mechanism to obtain a fixed-length, time-independent feature representation. The model has around 22.1M parameters in total, which is relatively light compared with large-scale pre-trained speech models. We achieved an equal error rate (EER) of 15.6% in the closed track and 11.1% in the open track, against 22.1% for the baseline system. We also curated additional LangId data from YouTube videos of Singaporean speakers, which will be released for public use.
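To make the decoder's attentive temporal pooling concrete, here is a minimal NumPy sketch of one common way to pool a variable-length sequence of frame features into a single fixed-length vector. It is not the authors' implementation: the tanh-based additive scoring, the function name `attentive_temporal_pooling`, the parameters `w` and `v`, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np


def attentive_temporal_pooling(frames, w, v):
    """Pool (T, D) frame features into a single (D,) vector.

    frames: (T, D) encoder outputs for T time steps (T may vary per utterance).
    w:      (D, H) hypothetical projection matrix for the attention scorer.
    v:      (H,)   hypothetical scoring vector.
    """
    # One scalar relevance score per frame (additive-attention style; assumed).
    scores = np.tanh(frames @ w) @ v              # shape (T,)
    # Softmax over the time axis, shifted by the max for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted average of the frames: the time axis disappears.
    return weights @ frames                       # shape (D,)


# Random stand-ins for learned parameters and encoder outputs.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
v = rng.normal(size=4)

short_utt = attentive_temporal_pooling(rng.normal(size=(50, 8)), w, v)
long_utt = attentive_temporal_pooling(rng.normal(size=(120, 8)), w, v)
print(short_utt.shape, long_utt.shape)
```

Both utterances map to a vector of the same size regardless of their length, which is the property the abstract describes as a fixed-length, time-independent representation. In a trained model `w` and `v` would be learned jointly with the rest of the network.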

* Accepted by Interspeech 2023, 5 pages, 1 figure, 4 tables
