Khushboo Thaker

Bringing Structure into Summaries: a Faceted Summarization Dataset for Long Scientific Documents

Jun 23, 2021

Rui Meng, Khushboo Thaker, Lei Zhang, Yue Dong, Xingdi Yuan, Tong Wang, Daqing He

Abstract: Faceted summarization provides briefings of a document from different perspectives. Readers can quickly comprehend the main points of a long document with the help of a structured outline. However, little research has been conducted on this subject, partially due to the lack of large-scale faceted summarization datasets. In this study, we present FacetSum, a faceted summarization benchmark built on Emerald journal articles, covering a diverse range of domains. Different from traditional document-summary pairs, FacetSum provides multiple summaries, each targeted at specific sections of a long document, including the purpose, method, findings, and value. Analyses and empirical results on our dataset reveal the importance of bringing structure into summaries. We believe FacetSum will spur further advances in summarization research and foster the development of NLP systems that can leverage the structured information in both long texts and summaries.

* Accepted at ACL 2021

Generating Diverse Numbers of Diverse Keyphrases

Oct 11, 2018

Xingdi Yuan, Tong Wang, Rui Meng, Khushboo Thaker, Daqing He, Adam Trischler

Abstract: Existing keyphrase generation studies suffer from the problems of generating duplicate phrases and deficient evaluation based on a fixed number of predicted phrases. We propose a recurrent generative model that generates multiple keyphrases sequentially from a text, with specific modules that promote generation diversity. We further propose two new metrics that consider a variable number of phrases. With both existing and proposed evaluation setups, our model demonstrates superior performance to baselines on three types of keyphrase generation datasets, including two newly introduced in this work: StackExchange and TextWorld ACG. In contrast to previous keyphrase generation approaches, our model generates sets of diverse keyphrases of a variable number.

* 15 pages, Currently under review
