Dr. Prathosh A. P.

Neural Compound-Word (Sandhi) Generation and Splitting in Sanskrit Language

Oct 24, 2020

Sushant Dave, Arun Kumar Singh, Dr. Prathosh A. P., Prof. Brejesh Lall

Abstract: This paper describes neural-network-based approaches to the formation and splitting of compound words, known respectively as Sandhi and Vichchhed, in the Sanskrit language. Sandhi is essential to the morphological analysis of Sanskrit texts and causes transformations at word boundaries. The rules of Sandhi formation are well defined but complex, sometimes optional, and in some cases require knowledge of the nature of the words being compounded. Sandhi splitting, or Vichchhed, is an even more difficult task given its non-uniqueness and context dependence. In this work, we formulate the problem as a sequence-to-sequence prediction task using modern deep learning techniques. Our technique is the first that is fully data driven, and we demonstrate that the model achieves better accuracy than existing methods on multiple standard datasets despite using no additional lexical or morphological resources. The code is being made available at https://github.com/IITD-DataScience/Sandhi_Prakarana

* 6 pages, 3 figures, CODS-COMAD 2021, IIIT Bangalore, India
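
The sequence-to-sequence framing mentioned in the abstract above can be illustrated with a minimal data-preparation sketch. This is not the authors' code: the `+` boundary marker, the `<s>`/`</s>` tokens, the function names, and the SLP1-style romanization are all assumptions made for illustration. It only shows how one word pair could be turned into character-level source and target sequences for Sandhi generation, and how the same pair is reversed for splitting (Vichchhed).

```python
from typing import List, Tuple

# Hypothetical special symbols; the paper's actual encoding may differ.
SEP = "+"                  # marks the boundary between the two input words
BOS, EOS = "<s>", "</s>"   # start and end tokens for a decoder

def to_chars(text: str) -> List[str]:
    """Wrap a string as a character-level token sequence."""
    return [BOS] + list(text) + [EOS]

def make_generation_pair(first: str, second: str, compound: str) -> Tuple[List[str], List[str]]:
    """Sandhi generation: source is 'first+second', target is the compound."""
    return to_chars(f"{first}{SEP}{second}"), to_chars(compound)

def make_split_pair(first: str, second: str, compound: str) -> Tuple[List[str], List[str]]:
    """Sandhi splitting: the same pair with source and target swapped."""
    source, target = make_generation_pair(first, second, compound)
    return target, source

# Two textbook Sandhi examples in an SLP1-style romanization (an assumption):
# sUrya + udaya -> sUryodaya, vidyA + Alaya -> vidyAlaya
examples = [("sUrya", "udaya", "sUryodaya"), ("vidyA", "Alaya", "vidyAlaya")]

for first, second, compound in examples:
    src, tgt = make_generation_pair(first, second, compound)
    print("generate:", " ".join(src), "=>", " ".join(tgt))
    src, tgt = make_split_pair(first, second, compound)
    print("split:   ", " ".join(src), "=>", " ".join(tgt))
```

Pairs prepared this way could be fed to any encoder-decoder model. Working at the character level means the boundary transformation (for example `a` + `u` becoming `o`) has to be learned from data rather than supplied as a rule.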


A Benchmark Corpus and Neural Approach for Sanskrit Derivative Nouns Analysis

Oct 24, 2020

Arun Kumar Singh, Sushant Dave, Dr. Prathosh A. P., Prof. Brejesh Lall, Shresth Mehta

Abstract: This paper presents the first benchmark corpus of Sanskrit Pratyayas (suffixes) and the inflectional words (padas) formed from them, along with neural-network-based approaches to the formation and splitting of inflectional words. Within the scope of the current work, inflectional words span the primary and secondary derivative nouns. Pratyayas are an important dimension of the morphological analysis of Sanskrit texts. Sanskrit computational linguistics tools exist for processing and analyzing Sanskrit texts. Unfortunately, there has been no work to standardize and validate these tools specifically for derivative noun analysis. In this work, we prepare a Sanskrit suffix benchmark corpus called Pratyaya-Kosh to evaluate the performance of such tools. We also present our own neural approach for derivative noun analysis and evaluate the most prominent Sanskrit morphological analysis tools on the same corpus. This benchmark will be freely available to researchers worldwide, and we hope it will motivate improvements in the morphological analysis of the Sanskrit language.

* 6 pages, 2 figures, EACL 2021 Submission
