Abstract:Hand-engineered feature sets are a well understood method for creating robust NLP models, but they require a lot of expertise and effort to create. In this work we describe how to automatically generate rich feature sets from simple units called featlets, requiring less engineering. Using information gain to guide the generation process, we train models which rival the state of the art on two standard Semantic Role Labeling datasets with almost no task or linguistic insight.
Abstract:Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible. We argue for and describe a new paradigm where the focus is on a high-recall extraction over a small collection of documents under the supervision of a human expert, that we call Interactive Knowledge Base Population (IKBP).