Shu Rong

PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution

Oct 23, 2018

Hong Chen, Zhenhua Fan, Hao Lu, Alan L. Yuille, Shu Rong

Abstract: We introduce PreCo, a large-scale English dataset for coreference resolution. The dataset is designed to embody the core challenges in coreference, such as entity representation, by alleviating the challenge of low overlap between training and test sets and enabling separated analysis of mention detection and mention clustering. To strengthen the training-test overlap, we collect a large corpus of about 38K documents and 12.4M words which are mostly from the vocabulary of English-speaking preschoolers. Experiments show that with higher training-test overlap, error analysis on PreCo is more efficient than the one on OntoNotes, a popular existing dataset. Furthermore, we annotate singleton mentions making it possible for the first time to quantify the influence that a mention detector makes on coreference resolution performance. The dataset is freely available at https://preschool-lab.github.io/PreCo/.

* EMNLP 2018
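
The abstract does not say how training-test overlap is measured. As a rough illustration only, the sketch below assumes one simple proxy: the fraction of test tokens whose lowercased form also appears somewhere in the training documents. The function name `token_overlap` and the toy sentences are hypothetical and are not taken from the paper or the dataset.

```python
def token_overlap(train_docs, test_docs):
    """Fraction of test tokens whose lowercased form occurs in the training docs.

    Assumption: whitespace tokenization and exact lowercase matching; the
    paper may define overlap differently.
    """
    train_vocab = {tok.lower() for doc in train_docs for tok in doc.split()}
    test_tokens = [tok.lower() for doc in test_docs for tok in doc.split()]
    if not test_tokens:
        return 0.0
    seen = sum(1 for tok in test_tokens if tok in train_vocab)
    return seen / len(test_tokens)


# Toy documents, made up for illustration.
train = ["the dog ran to the park", "a cat sat on the mat"]
test = ["the cat ran to the zoo"]

overlap = token_overlap(train, test)
print(f"{overlap:.2f}")  # 5 of the 6 test tokens were seen in training
```

A restricted vocabulary pushes this kind of ratio toward 1, which is the intuition behind collecting text written mostly in preschool-level words.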

AM-GAN: Improved Usage of Class-Labels in Generative Adversarial Nets

Jul 11, 2018

Zhiming Zhou, Han Cai, Shu Rong, Yuxuan Song, Kan Ren, Weinan Zhang, Yong Yu, Jun Wang

Abstract: Class labels have been empirically shown useful in improving the sample quality of generative adversarial nets (GANs). In this paper, we mathematically study the properties of the current variants of GANs that make use of class label information. With class aware gradient and cross-entropy decomposition, we reveal how class labels and associated losses influence GAN's training. Based on that, we propose Activation Maximization Generative Adversarial Networks (AM-GAN) as an advanced solution. Comprehensive experiments have been conducted to validate our analysis and evaluate the effectiveness of our solution, where AM-GAN outperforms other strong baselines and achieves state-of-the-art Inception Score (8.91) on CIFAR-10. In addition, we demonstrate that, with the Inception ImageNet classifier, Inception Score mainly tracks the diversity of the generator, and there is, however, no reliable evidence that it can reflect the true sample quality. We thus propose a new metric, called AM Score, to provide a more accurate estimation of the sample quality. Our proposed model also outperforms the baseline methods in the new metric.

* Accepted as a conference paper at ICLR 2018
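
For readers unfamiliar with the Inception Score mentioned in the abstract, the sketch below computes the standard definition, exp(E_x[KL(p(y|x) || p(y))]), from a list of per-sample class distributions. It is a minimal illustration, not the authors' evaluation code: the function name `inception_score` is chosen here, and the probability vectors are made-up stand-ins for real classifier outputs. The AM Score is not shown because the abstract does not define it.

```python
import math


def inception_score(probs):
    """Inception Score from per-sample class distributions p(y|x).

    probs: list of equal-length lists, each summing to 1. In practice these
    come from a pretrained classifier; here they are supplied by hand.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    total_kl = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with p_j = 0 contribute nothing.
        total_kl += sum(
            pj * math.log(pj / marginal[j]) for j, pj in enumerate(p) if pj > 0
        )
    return math.exp(total_kl / n)


# Toy inputs: confident predictions spread over two classes vs. one class only.
confident_diverse = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]

print(inception_score(confident_diverse))  # reaches the maximum for 2 classes
print(inception_score(collapsed))          # falls to the minimum of 1
```

Both toy inputs are equally confident per sample, and only the spread across classes differs, so the score here changes with diversity alone.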

QA4IE: A Question Answering based Framework for Information Extraction

Apr 10, 2018

Lin Qiu, Hao Zhou, Yanru Qu, Weinan Zhang, Suoheng Li, Shu Rong, Dongyu Ru, Lihua Qian, Kewei Tu, Yong Yu

Abstract: Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured texts. Common IE solutions, including Relation Extraction (RE) and open IE systems, can hardly handle cross-sentence tuples, and are severely restricted by limited relation types as well as informal relation specifications (e.g., free-text based relation tuples). In order to overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages the flexible question answering (QA) approaches to produce high quality relation triples across sentences. Based on the framework, we develop a large IE benchmark with high quality human evaluation. This benchmark contains 293K documents, 2M golden relation triples, and 636 relation types. We compare our system with some IE baselines on our benchmark and the results show that our system achieves great improvements.
