Anoop Kunchukuttan

Pralekha: An Indic Document Alignment Evaluation Benchmark

Nov 28, 2024

BhasaAnuvaad: A Speech Translation Dataset for 14 Indian Languages

Nov 07, 2024

Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs

Oct 17, 2024

An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models

Jul 08, 2024

How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?

Jun 06, 2024

Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation

Mar 25, 2024

IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages

Mar 11, 2024

Airavata: Introducing Hindi Instruction-tuned LLM

Jan 26, 2024

RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization

Jan 25, 2024

Bhasha-Abhijnaanam: Native-script and romanized Language Identification for 22 Indic languages

May 25, 2023