Title: Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains

Feb 05, 2024

Sanjana Ramprasad, Kundan Krishna, Zachary C Lipton, Byron C Wallace

Abstract: Recent work has shown that large language models (LLMs) are capable of generating summaries zero-shot (i.e., without explicit supervision) that, under human assessment, are often comparable or even preferred to manually composed reference summaries. However, this prior work has focussed almost exclusively on evaluating news article summarization. How do zero-shot summarizers perform in other (potentially more specialized) domains? In this work we evaluate zero-shot generated summaries across specialized domains including biomedical articles and legal bills (in addition to standard news benchmarks for reference). We focus especially on the factuality of outputs. We acquire annotations from domain experts to identify inconsistencies in summaries and systematically categorize these errors. We analyze whether the prevalence of a given domain in the pretraining corpus affects extractiveness and faithfulness of generated summaries of articles in this domain. We release all collected annotations to facilitate additional research toward measuring and realizing factually accurate summarization, beyond news articles. The dataset can be downloaded from https://github.com/sanjanaramprasad/zero_shot_faceval_domains
