Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jakir Hasan

BanglaIPA: Towards Robust Text-to-IPA Transcription with Contextual Rewriting in Bengali

Jan 05, 2026

Jakir Hasan, Shrestha Datta, Md Saiful Islam, Shubhashis Roy Dipta, Ameya Debnath

Abstract:Despite its widespread use, Bengali lacks a robust automated International Phonetic Alphabet (IPA) transcription system that effectively supports both standard language and regional dialectal texts. Existing approaches struggle to handle regional variations, numerical expressions, and generalize poorly to previously unseen words. To address these limitations, we propose BanglaIPA, a novel IPA generation system that integrates a character-based vocabulary with word-level alignment. The proposed system accurately handles Bengali numerals and demonstrates strong performance across regional dialects. BanglaIPA improves inference efficiency by leveraging a precomputed word-to-IPA mapping dictionary for previously observed words. The system is evaluated on the standard Bengali and six regional variations of the DUAL-IPA dataset. Experimental results show that BanglaIPA outperforms baseline IPA transcription models by 58.4-78.7% and achieves an overall mean word error rate of 11.4%, highlighting its robustness in phonetic transcription generation for the Bengali language.

Via

Access Paper or Ask Questions

G-RAG: Knowledge Expansion in Material Science

Nov 21, 2024

Radeen Mostafa, Mirza Nihal Baig, Mashaekh Tausif Ehsan, Jakir Hasan

Figure 1 for G-RAG: Knowledge Expansion in Material Science

Figure 2 for G-RAG: Knowledge Expansion in Material Science

Figure 3 for G-RAG: Knowledge Expansion in Material Science

Figure 4 for G-RAG: Knowledge Expansion in Material Science

Abstract:In the field of Material Science, effective information retrieval systems are essential for facilitating research. Traditional Retrieval-Augmented Generation (RAG) approaches in Large Language Models (LLMs) often encounter challenges such as outdated information, hallucinations, limited interpretability due to context constraints, and inaccurate retrieval. To address these issues, Graph RAG integrates graph databases to enhance the retrieval process. Our proposed method processes Material Science documents by extracting key entities (referred to as MatIDs) from sentences, which are then utilized to query external Wikipedia knowledge bases (KBs) for additional relevant information. We implement an agent-based parsing technique to achieve a more detailed representation of the documents. Our improved version of Graph RAG called G-RAG further leverages a graph database to capture relationships between these entities, improving both retrieval accuracy and contextual understanding. This enhanced approach demonstrates significant improvements in performance for domains that require precise information retrieval, such as Material Science.

Via

Access Paper or Ask Questions

Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment

Nov 07, 2023

Jakir Hasan, Shrestha Datta, Ameya Debnath

Abstract:The International Phonetic Alphabet (IPA) is indispensable in language learning and understanding, aiding users in accurate pronunciation and comprehension. Additionally, it plays a pivotal role in speech therapy, linguistic research, accurate transliteration, and the development of text-to-speech systems, making it an essential tool across diverse fields. Bangla being 7th as one of the widely used languages, gives rise to the need for IPA in its domain. Its IPA mapping is too diverse to be captured manually giving the need for Artificial Intelligence and Machine Learning in this field. In this study, we have utilized a transformer-based sequence-to-sequence model at the letter and symbol level to get the IPA of each Bangla word as the variation of IPA in association of different words is almost null. Our transformer model only consisted of 8.5 million parameters with only a single decoder and encoder layer. Additionally, to handle the punctuation marks and the occurrence of foreign languages in the text, we have utilized manual mapping as the model won't be able to learn to separate them from Bangla words while decreasing our required computational resources. Finally, maintaining the relative position of the sentence component IPAs and generation of the combined IPA has led us to achieve the top position with a word error rate of 0.10582 in the public ranking of DataVerse Challenge - ITVerse 2023 (https://www.kaggle.com/competitions/dataverse_2023/).

* Achieved top position with a word error rate of 0.10582 in the public ranking of DataVerse Challenge - ITVerse 2023 (link: https://www.kaggle.com/competitions/dataverse_2023/). All codes can be found on the respective competition webpage

Via

Access Paper or Ask Questions