Abstract:Entity synonyms discovery is crucial for entity-leveraging applications. However, existing studies suffer from several critical issues: (1) the input mentions may be out-of-vocabulary (OOV) and may come from a different semantic space of the entities; (2) the connection between mentions and entities may be hidden and cannot be established by surface matching; and (3) some entities rarely appear due to the long-tail effect. To tackle these challenges, we facilitate knowledge graphs and propose a novel entity synonyms discovery framework, named \emph{KGSynNet}. Specifically, we pre-train subword embeddings for mentions and entities using a large-scale domain-specific corpus while learning the knowledge embeddings of entities via a joint TransC-TransE model. More importantly, to obtain a comprehensive representation of entities, we employ a specifically designed \emph{fusion gate} to adaptively absorb the entities' knowledge information into their semantic features. We conduct extensive experiments to demonstrate the effectiveness of our \emph{KGSynNet} in leveraging the knowledge graph. The experimental results show that the \emph{KGSynNet} improves the state-of-the-art methods by 14.7\% in terms of hits@3 in the offline evaluation and outperforms the BERT model by 8.3\% in the positive feedback rate of an online A/B test on the entity linking module of a question answering system.
Abstract:This paper describes our approach in DSTC 8 Track 4: Schema-Guided Dialogue State Tracking. The goal of this task is to predict the intents and slots in each user turn to complete the dialogue state tracking (DST) based on the information provided by the task's schema. Different from traditional stage-wise DST, we propose an end-to-end DST system to avoid error accumulation between the dialogue turns. The DST system consists of a machine reading comprehension (MRC) model for non-categorical slots and a Wide & Deep model for categorical slots. As far as we know, this is the first time that MRC and Wide & Deep model are applied to DST problem in a fully end-to-end way. Experimental results show that our framework achieves an excellent performance on the test dataset including 50% zero-shot services with a joint goal accuracy of 0.8652 and a slot tagging F1-Score of 0.9835.
Abstract:For the task of open domain Knowledge Based Question Answering in CCKS2019, we propose a method combining information retrieval and semantic parsing. This multi-module system extracts the topic entity and the most related relation predicate from a question and transforms it into a Sparql query statement. Our method obtained the F1 score of 70.45% on the test data.