Fernando C. N. Pereira

AT&T Research

SLING: A framework for frame semantic parsing

Oct 19, 2017

Michael Ringgaard, Rahul Gupta, Fernando C. N. Pereira

Abstract: We describe SLING, a framework for parsing natural language into semantic frames. SLING supports general transition-based, neural-network parsing with bidirectional LSTM input encoding and a Transition Based Recurrent Unit (TBRU) for output decoding. The parsing model is trained end-to-end using only the text tokens as input. The transition system has been designed to output frame graphs directly without any intervening symbolic representation. The SLING framework includes an efficient and scalable frame store implementation as well as a neural network JIT compiler for fast inference during parsing. SLING is implemented in C++ and is available for download on GitHub.

Similarity-Based Models of Word Cooccurrence Probabilities

Sep 27, 1998

Ido Dagan, Lillian Lee, Fernando C. N. Pereira

Abstract: In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations "eat a peach" and "eat a beach" is more likely. Statistical NLP methods determine the likelihood of a word combination from its frequency in a training corpus. However, the nature of language is such that many word combinations are infrequent and do not occur in any given corpus. In this work we propose a method for estimating the probability of such previously unseen word combinations using available information on "most similar" words. We describe probabilistic word association models based on distributional word similarity, and apply them to two tasks, language modeling and pseudo-word disambiguation. In the language modeling task, a similarity-based model is used to improve probability estimates for unseen bigrams in a back-off language model. The similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error. We also compare four similarity-based estimation methods against back-off and maximum-likelihood estimation methods on a pseudo-word sense disambiguation task in which we controlled for both unigram and bigram frequency to avoid giving too much weight to easy-to-disambiguate high-frequency configurations. The similarity-based methods perform up to 40% better on this particular task.

* Machine Learning, 34, 43-69 (1999)
* 26 pages, 5 figures
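
The idea in the abstract above, estimating the probability of an unseen word pair from the behaviour of distributionally similar words, can be sketched roughly as follows. This is a minimal illustration only: the Jensen-Shannon divergence, the exponential weighting with `beta`, the neighbourhood size `k`, and all function names are assumptions made for this sketch, not necessarily the choices made in the paper.

```python
import math
from collections import Counter, defaultdict


def conditional_probs(bigrams):
    """Maximum-likelihood P(w2 | w1) from a list of (w1, w2) pairs."""
    counts = defaultdict(Counter)
    for w1, w2 in bigrams:
        counts[w1][w2] += 1
    return {
        w1: {w2: c / sum(ctr.values()) for w2, c in ctr.items()}
        for w1, ctr in counts.items()
    }


def js_divergence(p, q):
    """Jensen-Shannon divergence between two sparse distributions (natural log)."""
    total = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        m = 0.5 * (pw + qw)
        if pw > 0:
            total += 0.5 * pw * math.log(pw / m)
        if qw > 0:
            total += 0.5 * qw * math.log(qw / m)
    return total


def similarity_estimate(w1, w2, probs, beta=4.0, k=2):
    """Estimate P(w2 | w1) as a similarity-weighted average over the k words
    whose next-word distributions are closest to that of w1 (assumed scheme)."""
    neighbours = sorted(
        (js_divergence(probs[w1], probs[v]), v) for v in probs if v != w1
    )[:k]
    weights = [(math.exp(-beta * d), v) for d, v in neighbours]
    norm = sum(wt for wt, _ in weights)
    return sum(wt * probs[v].get(w2, 0.0) for wt, v in weights) / norm


# Toy corpus (hypothetical): "eat peach" never occurs, but "devour peach" does.
toy_bigrams = [
    ("eat", "apple"), ("eat", "bread"),
    ("devour", "apple"), ("devour", "bread"), ("devour", "peach"),
    ("walk", "beach"), ("walk", "road"),
]
toy_probs = conditional_probs(toy_bigrams)
```

With this toy data the maximum-likelihood estimate of P(peach | eat) is zero, while `similarity_estimate("eat", "peach", toy_probs)` is positive and larger than the estimate for "beach", because "devour" is a much closer neighbour of "eat" than "walk" is. In a back-off model such an estimate would only be consulted for bigrams that were not seen in training.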

Beyond Word N-Grams

Jul 13, 1996

Fernando C. N. Pereira, Yoram Singer, Naftali Tishby

Abstract: We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mixtures have provably and practically better performance than almost any single model. We evaluate the model on several corpora. The low perplexity achieved by relatively small PST mixture models suggests that they may be an advantageous alternative, both theoretically and practically, to the widely used n-gram models.

* 15 pages, one PostScript figure, uses psfig.sty and fullname.sty. Revised version of a paper in the Proceedings of the Third Workshop on Very Large Corpora, MIT, 1995
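
The core idea of predicting from a variable-length suffix of the history, rather than a fixed n-gram window, can be illustrated with a much simplified sketch. It predicts from the single longest suffix that has been observed often enough; it does not implement the Bayesian mixture over trees described in the abstract. The class name, `max_depth`, and `min_count` are hypothetical. Keying contexts by tuples in a dictionary is one simple way to cope with an open vocabulary.

```python
from collections import Counter, defaultdict


class SuffixContextPredictor:
    """Next-word prediction from the longest sufficiently frequent suffix."""

    def __init__(self, max_depth=3, min_count=2):
        self.max_depth = max_depth
        self.min_count = min_count
        # Maps a context (tuple of preceding words) to next-word counts.
        self.counts = defaultdict(Counter)

    def train(self, words):
        for i, w in enumerate(words):
            for d in range(self.max_depth + 1):
                if i - d < 0:
                    break
                self.counts[tuple(words[i - d:i])][w] += 1

    def context_for(self, history):
        """Longest suffix of history seen at least min_count times, else ()."""
        for d in range(min(self.max_depth, len(history)), 0, -1):
            ctx = tuple(history[len(history) - d:])
            if sum(self.counts.get(ctx, {}).values()) >= self.min_count:
                return ctx
        return ()

    def prob(self, history, word):
        ctr = self.counts[self.context_for(history)]
        total = sum(ctr.values())
        return ctr[word] / total if total else 0.0


model = SuffixContextPredictor()
model.train("the cat sat on the mat the cat ran".split())
```

Here `model.context_for(["on", "the"])` falls back from the rare two-word context to `("the",)`, so the effective context length varies with how much evidence is available.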

Finite-State Approximation of Phrase-Structure Grammars

Mar 08, 1996

Fernando C. N. Pereira, Rebecca N. Wright

Abstract: Phrase-structure grammars are effective models for important syntactic and semantic aspects of natural languages, but can be computationally too demanding for use as language models in real-time speech recognition. Therefore, finite-state models are used instead, even though they lack expressive power. To reconcile those two alternatives, we designed an algorithm to compute finite-state approximations of context-free grammars and context-free-equivalent augmented phrase-structure grammars. The approximation is exact for certain context-free grammars generating regular languages, including all left-linear and right-linear context-free grammars. The algorithm has been used to build finite-state language models for limited-domain speech recognition tasks.

* 24 pages, uses psfig.sty; revised and extended version of the 1991 ACL meeting paper with the same title
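
The abstract notes that right-linear grammars are a case where a finite-state model can be exact. The sketch below shows why, using the standard textbook construction that turns a right-linear grammar into a nondeterministic finite automaton. It is not the approximation algorithm of the paper, and the rule encoding and function names are assumptions made for this example. Each rule is assumed to have one of the forms A -> a B, A -> a, or A -> epsilon.

```python
FINAL = "<final>"  # extra accepting state for rules of the form A -> a


def right_linear_to_nfa(rules, start):
    """rules: list of (lhs, terminal, next_nonterminal).

    (A, "a", "B")   encodes A -> a B
    (A, "a", None)  encodes A -> a
    (A, None, None) encodes A -> epsilon
    """
    arcs, finals = {}, {FINAL}
    for lhs, terminal, nxt in rules:
        if terminal is None:
            finals.add(lhs)  # A -> epsilon makes A accepting
        else:
            target = nxt if nxt is not None else FINAL
            arcs.setdefault((lhs, terminal), set()).add(target)
    return start, arcs, finals


def accepts(nfa, symbols):
    """Simulate the NFA by tracking the set of reachable states."""
    start, arcs, finals = nfa
    current = {start}
    for s in symbols:
        current = {t for q in current for t in arcs.get((q, s), ())}
    return bool(current & finals)


# S -> a S | b T ;  T -> b T | epsilon   (generates a* b+)
grammar = [("S", "a", "S"), ("S", "b", "T"), ("T", "b", "T"), ("T", None, None)]
nfa = right_linear_to_nfa(grammar, "S")
```

Nonterminals become states and each rule becomes at most one arc, so the automaton accepts exactly the strings the grammar generates, for example `accepts(nfa, "aab")` holds while `accepts(nfa, "ba")` does not.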

Speech Recognition by Composition of Weighted Finite Automata

Mar 07, 1996

Fernando C. N. Pereira, Michael D. Riley

Abstract: We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.

* 24 pages, uses psfig.sty
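
To make the role of composition concrete, here is a minimal sketch of composing two weighted transducers and reading off the cheapest path. It assumes epsilon-free transducers, non-negative weights combined in the tropical (min, +) semiring, and a hypothetical dictionary encoding (`start`, `finals`, `arcs`). The toy observation labels and weights are invented for illustration and are not taken from the paper.

```python
import heapq
from collections import deque
from itertools import count


def compose(a, b):
    """Compose epsilon-free transducers a and b; weights add along matched arcs."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ia, oa, wa, na in a["arcs"].get(qa, []):
            for ib, ob, wb, nb in b["arcs"].get(qb, []):
                if oa == ib:  # output of a must match input of b
                    nxt = (na, nb)
                    arcs[state].append((ia, ob, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


def best_path(t):
    """Cheapest accepting path (Dijkstra); returns (cost, inputs, outputs) or None."""
    tie = count()  # tie-breaker so heap entries never compare states
    done_marker = object()
    heap = [(0.0, next(tie), t["start"], (), ())]
    done = set()
    while heap:
        cost, _, state, ins, outs = heapq.heappop(heap)
        if state is done_marker:
            return cost, list(ins), list(outs)
        if state in done:
            continue
        done.add(state)
        if state in t["finals"]:
            final_cost = cost + t["finals"][state]
            heapq.heappush(heap, (final_cost, next(tie), done_marker, ins, outs))
        for i, o, w, nxt in t["arcs"].get(state, []):
            if nxt not in done:
                heapq.heappush(heap, (cost + w, next(tie), nxt, ins + (i,), outs + (o,)))
    return None


# Hypothetical "acoustic" transducer: observations -> word hypotheses with costs.
acoustic = {
    "start": 0,
    "finals": {2: 0.0},
    "arcs": {0: [("o1", "eat", 0.2)],
             1: [("o2", "peach", 1.0), ("o2", "beach", 0.8)]},
}
acoustic["arcs"] = {q: [(i, o, w, q + 1) for i, o, w in arcs]
                    for q, arcs in acoustic["arcs"].items()}

# Hypothetical language model as an acceptor (input label equals output label).
language_model = {
    "start": 0,
    "finals": {2: 0.0},
    "arcs": {0: [("eat", "eat", 0.0, 1)],
             1: [("peach", "peach", 0.5, 2), ("beach", "beach", 3.0, 2)]},
}

result = best_path(compose(acoustic, language_model))
```

Taken alone, the acoustic transducer slightly prefers "beach" (0.8 against 1.0), but after composition with the language model the cheapest path outputs "eat peach". The same `compose` function could be applied ahead of time to static models or at recognition time to an observation lattice, which is the uniformity the abstract emphasizes.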

Ellipsis and Higher-Order Unification

Mar 08, 1995

Mary Dalrymple, Stuart M. Shieber, Fernando C. N. Pereira

Abstract: We present a new method for characterizing the interpretive possibilities generated by elliptical constructions in natural language. Unlike previous analyses, which postulate ambiguity of interpretation or derivation in the full clause source of the ellipsis, our analysis requires no such hidden ambiguity. Further, the analysis follows relatively directly from an abstract statement of the ellipsis interpretation problem. It predicts correctly a wide range of interactions between ellipsis and other semantic phenomena such as quantifier scope and bound anaphora. Finally, although the analysis itself is stated nonprocedurally, it admits of a direct computational method for generating interpretations.

* Linguistics and Philosophy 14(4):399-452
* 54 pages

Principles and Implementation of Deductive Parsing

Apr 26, 1994

Stuart M. Shieber, Yves Schabes, Fernando C. N. Pereira

Abstract: We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.

* 69 pages, includes full Prolog code
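
The parsing-as-deduction view can be illustrated with a small agenda-driven engine. The paper's own code is in Prolog and covers several formalisms. The Python sketch below is only an assumed, simplified analogue restricted to CKY-style recognition for grammars in Chomsky normal form. Items `(A, i, j)` assert that category `A` spans `words[i:j]`; lexical entries supply the axioms, and each binary rule acts as an inference rule combining two adjacent items. The function names and the toy grammar are hypothetical.

```python
def deduce(words, lexical, binary):
    """Exhaustively derive CKY items (A, i, j) with an agenda and a chart."""
    # Axioms: one item per word and lexical category.
    agenda = [(cat, i, i + 1)
              for i, w in enumerate(words)
              for cat in lexical.get(w, [])]
    chart = set()
    while agenda:
        item = agenda.pop()
        if item in chart:
            continue  # already proved; skip redundant work
        chart.add(item)
        cat, i, j = item
        for parent, left, right in binary:
            for other, h, k in list(chart):
                # New item is the left antecedent; chart item follows it.
                if cat == left and other == right and h == j:
                    agenda.append((parent, i, k))
                # New item is the right antecedent; chart item precedes it.
                if cat == right and other == left and k == i:
                    agenda.append((parent, h, j))
    return chart


def recognize(words, lexical, binary, goal="S"):
    """The sentence is accepted if the goal item spanning all words is derivable."""
    return (goal, 0, len(words)) in deduce(words, lexical, binary)


# Toy grammar (hypothetical): S -> NP VP, VP -> V NP, NP -> Det N
toy_binary = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
toy_lexical = {"the": ["Det"], "dog": ["N"], "cat": ["N"], "saw": ["V"]}
```

For instance, `recognize("the dog saw the cat".split(), toy_lexical, toy_binary)` succeeds, while the scrambled input "dog the saw" is rejected. Swapping in a different set of axioms and inference rules, while keeping the agenda loop, is what lets one engine serve several parsing algorithms.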
