Rakesh Chandra Balabantaray

Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages

May 08, 2024

Sankalp Bahad, Pruthwik Mishra, Karunesh Arora, Rakesh Chandra Balabantaray, Dipti Misra Sharma, Parameswari Krishnamurthy

Abstract: Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. Research on NER is centered around English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges and propose techniques that can be tailored for Multilingual Named Entity Recognition for Indian Languages. We present a human-annotated named entity corpus of 40K sentences for 4 Indian languages from two of the major Indian language families. Additionally, we present a multilingual model fine-tuned on our dataset, which achieves an average F1 score of 0.80 on our dataset. We achieve comparable performance on completely unseen benchmark datasets for Indian languages, which affirms the usability of our model.

* 8 pages, accepted in NAACL-SRW, 2024

Automatic Parallel Corpus Creation for Hindi-English News Translation Task

Jan 24, 2019

Aditya Kumar Pathak, Priyankit Acharya, Dilpreet Kaur, Rakesh Chandra Balabantaray

Abstract: Parallel corpora are very important for multilingual NLP tasks and for applications such as Statistical Machine Translation systems. The Hindi-English parallel corpus available to date for the news translation task is very small relative to what these systems require. In this work, we have developed a prototype of an automatic parallel corpus generation system, which creates a Hindi-English parallel corpus for the news translation task. To verify the quality of the generated parallel corpus, we have evaluated it using various performance metrics, and the results are quite interesting.
