M Sasikumar

Machine Translation in Indian Languages: Challenges and Resolution

Aug 01, 2018

Raj Nath Patel, Prakash B. Pimpale, M Sasikumar

Abstract: English to Indian language machine translation poses the challenge of structural and morphological divergence. This paper describes English to Indian language statistical machine translation using pre-ordering and suffix separation. The pre-ordering uses rules to restructure the source sentences prior to training and translation. This syntactic restructuring helps statistical machine translation tackle the structural divergence and hence yields better translation quality. The suffix separation is used to tackle the morphological divergence between English and highly agglutinative Indian languages. We demonstrate that the use of pre-ordering and suffix separation improves the quality of English to Indian language machine translation.

* 11 pages, journal paper
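To illustrate the idea of suffix separation mentioned in the abstract, here is a minimal sketch. It is not the authors' implementation: the function names, the `@@` marker, the `min_stem` threshold, and the romanized toy suffix list are all assumptions made for illustration. It splits the longest matching suffix off each token so that a statistical system sees the stem and the suffix as separate units.

```python
def separate_suffix(token, suffixes, min_stem=2, marker="@@"):
    """Split the longest matching suffix off a token.

    `suffixes`, `min_stem`, and `marker` are hypothetical parameters,
    not taken from the paper.
    """
    # Try longer suffixes first so the longest match wins.
    for suf in sorted(suffixes, key=len, reverse=True):
        # Skip empty suffixes and splits that leave too short a stem.
        if suf and token.endswith(suf) and len(token) - len(suf) >= min_stem:
            return [token[:-len(suf)], marker + suf]
    # No suffix matched: keep the token whole.
    return [token]


def separate_sentence(sentence, suffixes):
    """Apply suffix separation to every whitespace-delimited token."""
    pieces = []
    for token in sentence.split():
        pieces.extend(separate_suffix(token, suffixes))
    return " ".join(pieces)


if __name__ == "__main__":
    # Toy romanized suffix list, for illustration only.
    toy_suffixes = ["at", "madhye"]
    print(separate_sentence("gharat ja", toy_suffixes))
```

The marker on the suffix piece makes the split reversible after translation: a post-processing step can rejoin any piece that starts with the marker to the token before it.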

Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text

Nov 16, 2016

Raj Nath Patel, Prakash B. Pimpale, M Sasikumar

Abstract: This paper describes the submission by the Centre for Development of Advanced Computing (CDACM) to the shared task 'Tool Contest on POS tagging for Code-Mixed Indian Social Media (Facebook, Twitter, and Whatsapp) Text', co-located with ICON-2016. The shared task was to predict the part-of-speech (POS) tag of each word in a given text. Code-mixed text is generated mostly on social media by multilingual users. The presence of multilingual words, transliterations, and spelling variations makes such content linguistically complex. In this paper, we propose an approach to POS tagging of code-mixed social media text using a Recurrent Neural Network Language Model (RNN-LM) architecture. We submitted results for Hindi-English (hi-en), Bengali-English (bn-en), and Telugu-English (te-en) code-mixed data.

* 7 pages. In Proceedings of the Tool Contest on POS tagging for Indian Social Media Text, ICON 2016