Aarne Ranta

Urdu Morphology, Orthography and Lexicon Extraction

Apr 06, 2022

Muhammad Humayoun, Harald Hammarström, Aarne Ranta

Abstract: Urdu is a challenging language because of, first, its Perso-Arabic script and, second, its morphological system, which combines grammatical forms and vocabulary from Arabic, Persian and the native languages of South Asia. This paper describes an implementation of the Urdu language as a software API, covering orthography, morphology and the extraction of the lexicon. The morphology is implemented in a toolkit called Functional Morphology (Forsberg & Ranta, 2004), which is based on the idea of treating grammars as software libraries. This implementation could therefore be reused in applications such as intelligent keyword search, language training and infrastructure for syntax. We also present an implementation of a small part of Urdu syntax to demonstrate this reusability.

* Published in CAASL-2: The Second Workshop on Computational Approaches to Arabic Script-based Languages, July 21-22, 2007, LSA 2007 Linguistic Institute, Stanford University

Embedded Controlled Languages

Jun 16, 2014

Aarne Ranta

Abstract: Inspired by embedded programming languages, an embedded CNL (controlled natural language) is a proper fragment of an entire natural language (its host language), but it has a parser that recognizes the entire host language. This makes it possible to process out-of-CNL input and give useful feedback to users, instead of just reporting syntax errors. This extended abstract explains the main concepts of embedded CNL implementation in GF (Grammatical Framework), with examples from machine translation and some other ongoing work.

* 7 pages, extended abstract, preprint for CNL 2014 in Galway
