Abstract: Manifold ranking has been successfully applied to query-oriented multi-document summarization. It exploits not only the relationships among sentences but also the relationships between the given query and the sentences. However, the information in the original query is often insufficient. We therefore present a query expansion method that is integrated into manifold ranking to address this problem. Our method not only uses the query terms themselves and the knowledge base WordNet to expand the query with synonyms, but also uses the document set itself to expand the query in several ways (mean expansion, variance expansion, and TextRank expansion). Compared with previous query expansion methods, our method combines multiple expansion strategies to better represent the query information, and it also represents a useful exploration of manifold ranking. In addition, we use the degree of word overlap and the proximity between words to calculate the similarity between sentences. We performed experiments on the DUC 2006 and DUC 2007 datasets, and the evaluation results show that the proposed query expansion method significantly improves system performance and makes our system comparable to state-of-the-art systems.
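As a rough illustration of the manifold ranking step that this abstract builds on, the sketch below uses the standard iterative formulation (scores are repeatedly propagated over a symmetrically normalized similarity graph, with the query as the only seed node). It is not the authors' implementation: the function name `manifold_rank`, the damping value `alpha=0.6`, the iteration count, and the toy similarity weights are all assumptions made for illustration, and the sentence similarity and query expansion steps described in the abstract are not modeled.

```python
def manifold_rank(W, y, alpha=0.6, iters=200):
    """Iterative manifold ranking: f <- alpha * S f + (1 - alpha) * y.

    W is a symmetric similarity matrix (list of lists), y the seed vector.
    alpha and iters are illustrative defaults, not values from the paper.
    """
    n = len(y)
    # Degree of each node, ignoring self-loops.
    deg = [sum(W[i][j] for j in range(n) if j != i) for i in range(n)]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and deg[i] > 0 and deg[j] > 0:
                S[i][j] = W[i][j] / (deg[i] * deg[j]) ** 0.5
    # Propagate scores from the seed node(s) through the graph.
    f = [float(v) for v in y]
    for _ in range(iters):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy graph (hypothetical weights): node 0 is the query, nodes 1-3 are sentences.
W = [
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
y = [1.0, 0.0, 0.0, 0.0]  # only the query node is a seed
scores = manifold_rank(W, y)
ranking = sorted(range(1, len(y)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy chain, the sentence linked directly to the query receives the highest score and the score decays with graph distance from the query. An expanded query would, under this reading, change the weights in the query row and column of `W` and thereby change how scores propagate.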
Abstract: Because the manifold ranking method is effective at ranking unknown data based on known data by using a weighted network, many researchers use it to solve the document summarization task. However, their models consider only the original features and ignore the semantic features of sentences when constructing the weighted networks for manifold ranking. To solve this problem, we propose two improved models based on the manifold ranking method. The first combines a topic model with the manifold ranking method (JTMMR) to solve the document summarization task. This model uses not only the original features but also semantic features to represent the document, which can improve the accuracy of the manifold ranking method. The second combines a lifelong topic model with the manifold ranking method (JLTMMR). Building on JTMMR, this model adds a knowledge constraint to improve the quality of the topics. We also add a constraint on the relationships between documents to extract better document semantic features. The JTMMR model can improve the effectiveness of the manifold ranking method by using the better semantic features. Experiments show that our models achieve better results than other baseline models on the multi-document summarization task. Our models also perform well on the single-document summarization task. After being combined with a few basic surface features, our model significantly outperforms some recent deep-learning-based models. Finally, we carry out exploratory work on lifelong machine learning by analyzing the effect of adding feedback. Experiments show that adding feedback to our model has a significant effect.