Abstract: Syntactic parsing with dependency structures has become a standard technique in natural language processing, with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learner, outperforms a baseline arc-standard parser and surpasses the unlabeled attachment score (UAS) of the MALT and LSTM parsers. We also outline possible ways to address the parsing of non-projective trees.
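The sketch below illustrates the kind of system the abstract describes: a greedy arc-standard transition parser scored by a perceptron. It is a minimal illustration under assumed simplifications (toy feature templates, a plain rather than averaged perceptron, no labeled arcs), not the paper's actual implementation.

```python
# Minimal arc-standard transition system with a perceptron scorer (illustrative sketch).
from collections import defaultdict

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT_ARC", "RIGHT_ARC"

def features(stack, buffer, words):
    """Toy feature templates over the stack top and buffer front (assumed, not the paper's)."""
    s0 = words[stack[-1]] if stack else "<EMPTY>"
    b0 = words[buffer[0]] if buffer else "<EMPTY>"
    return [f"s0={s0}", f"b0={b0}", f"s0+b0={s0}+{b0}"]

class Perceptron:
    def __init__(self, actions):
        self.weights = {a: defaultdict(float) for a in actions}

    def score(self, action, feats):
        return sum(self.weights[action][f] for f in feats)

    def update(self, gold_action, pred_action, feats):
        # Standard perceptron update: reward gold-action features, penalize predicted ones.
        if gold_action == pred_action:
            return
        for f in feats:
            self.weights[gold_action][f] += 1.0
            self.weights[pred_action][f] -= 1.0

def parse(words, model):
    """Greedy arc-standard parse; words[0] is an artificial root, returns head indices."""
    stack, buffer = [0], list(range(1, len(words)))
    heads = [None] * len(words)
    while buffer or len(stack) > 1:
        feats = features(stack, buffer, words)
        legal = []
        if buffer:
            legal.append(SHIFT)
        if len(stack) > 2:
            legal.append(LEFT_ARC)
        if len(stack) > 1:
            legal.append(RIGHT_ARC)
        action = max(legal, key=lambda a: model.score(a, feats))
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        else:  # RIGHT_ARC
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads
```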
Abstract: Node embeddings have become a ubiquitous technique for representing graph data in a low-dimensional space. Graph autoencoders, among the most widely adopted deep models, learn graph embeddings in an unsupervised way by minimizing the reconstruction error for the graph data. However, this reconstruction loss ignores the distribution of the latent representation, leading to inferior embeddings. To mitigate this problem, we propose a random-walk-based method to regularize the representations learned by the encoder. We show that the proposed enhancement beats existing state-of-the-art models by a large margin (up to 7.5\%) on the node clustering task, and achieves state-of-the-art accuracy on the link prediction task on three standard datasets: Cora, Citeseer, and PubMed. Code is available at https://github.com/MysteryVaibhav/DW-GAE.
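As a rough illustration of the idea, the sketch below augments a graph-autoencoder reconstruction loss with a DeepWalk-style random-walk regularizer that pulls together embeddings of nodes co-occurring on walks. The encoder architecture, walk sampler, and loss weighting are illustrative assumptions; the authors' released code at the repository above is the authoritative reference.

```python
# Graph autoencoder with a random-walk regularizer (illustrative sketch, not the released code).
import random
import torch
import torch.nn as nn

class GAE(nn.Module):
    def __init__(self, n_features, n_hidden, adj_norm):
        super().__init__()
        self.adj_norm = adj_norm                      # normalized adjacency (dense, for simplicity)
        self.w1 = nn.Linear(n_features, n_hidden)
        self.w2 = nn.Linear(n_hidden, n_hidden)

    def encode(self, x):
        h = torch.relu(self.adj_norm @ self.w1(x))    # simple graph-convolution layers
        return self.adj_norm @ self.w2(h)

    def decode(self, z):
        return torch.sigmoid(z @ z.t())               # inner-product decoder

def sample_walk_pairs(adj_list, walk_len=5, window=2):
    """Sample (node, context) pairs from one short random walk per node."""
    pairs = []
    for start in adj_list:
        walk, node = [start], start
        for _ in range(walk_len - 1):
            if not adj_list[node]:
                break
            node = random.choice(adj_list[node])
            walk.append(node)
        for i, u in enumerate(walk):
            for v in walk[max(0, i - window): i + window + 1]:
                if u != v:
                    pairs.append((u, v))
    return pairs

def loss_fn(model, x, adj_label, adj_list, reg_weight=0.1):
    z = model.encode(x)
    recon = nn.functional.binary_cross_entropy(model.decode(z), adj_label)
    # Random-walk regularizer: nodes that co-occur on walks should have similar embeddings.
    u, v = zip(*sample_walk_pairs(adj_list))
    sim = (z[list(u)] * z[list(v)]).sum(dim=1)
    walk_reg = -torch.log(torch.sigmoid(sim) + 1e-8).mean()
    return recon + reg_weight * walk_reg
```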
Abstract: Fast and effective automated indexing is critical for search and personalized services. Key phrases, which consist of one or more words and represent the main concepts of a document, are often used for indexing. In this paper, we investigate the use of additional semantic features and pre-processing steps to improve automatic key phrase extraction. These features include signal words and Freebase categories, and some of them lead to significant improvements in the accuracy of the results. We also experimented with two forms of document pre-processing that we call light filtering and co-reference normalization. Light filtering removes sentences that are judged peripheral to the document's main content; co-reference normalization unifies several written forms of the same named entity into a unique form. We also needed a "Gold Standard", a set of labeled documents for training and evaluation. While the subjective nature of key phrase selection precludes a true "Gold Standard", we used Amazon's Mechanical Turk service to obtain a useful approximation. Our data indicate that the biggest improvements in performance were due to shallow semantic features, news categories, and rhetorical signals (nDCG 78.47% vs. 68.93%). Including deeper semantic features such as Freebase sub-categories was not beneficial by itself, but in combination with pre-processing it yielded slight improvements in the nDCG scores.
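The following is a minimal sketch of two of the ingredients named in the abstract: light filtering of peripheral sentences and shallow features for ranking candidate key phrases. The signal-word list, the sentence-scoring heuristic, and the feature set are hypothetical stand-ins, not the paper's actual resources or feature templates.

```python
# Light filtering and shallow key-phrase features (illustrative sketch under assumed heuristics).
import re
from collections import Counter

SIGNAL_WORDS = {"importantly", "in summary", "we propose", "results show"}  # assumed examples

def light_filter(sentences, keep_ratio=0.8):
    """Drop sentences judged peripheral; here 'peripheral' means sharing few of the document's top terms."""
    doc_terms = Counter(w for s in sentences for w in re.findall(r"\w+", s.lower()))
    top = {w for w, _ in doc_terms.most_common(20)}
    scored = sorted(sentences, key=lambda s: -len(top & set(re.findall(r"\w+", s.lower()))))
    return scored[: max(1, int(keep_ratio * len(sentences)))]

def candidate_features(phrase, document):
    """Illustrative shallow features for a candidate key phrase."""
    doc_lower = document.lower()
    idx = doc_lower.find(phrase.lower())
    context = doc_lower[max(0, idx - 100): idx + 100] if idx >= 0 else ""
    return {
        "tf": doc_lower.count(phrase.lower()),
        "first_position": idx / max(1, len(document)),
        "signal_word_in_context": int(any(sw in context for sw in SIGNAL_WORDS)),
        "length_in_words": len(phrase.split()),
    }
```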