Abstract:This paper presents and analyzes an incremental algorithm for the construction of Acyclic Non-deterministic Finite-state Automata (NFA). Automata of this type are quite useful in computational linguistics, especially for storing lexicons. The proposed algorithm produces compact NFAs, i.e. NFAs that do not contain equivalent states. Unlike Deterministic Finite-state Automata (DFA), this property is not sufficient to ensure minimality, but still the resulting NFAs are considerably smaller than the minimal DFAs for the same languages.
Abstract:In this paper we present a lexicon-based approach to the problem of morphological processing. Full-form words, lemmas and grammatical tags are interconnected in a DAWG. Thus, the process of analysis/synthesis is reduced to a search in the graph, which is very fast and can be performed even if several pieces of information are missing from the input. The contents of the DAWG are updated using an on-line incremental process. The proposed approach is language independent and it does not utilize any morphophonetic rules or any other special linguistic information.