Title: AugSumm: towards generalizable speech summarization using synthetic labels from large language model

Jan 10, 2024

Jee-weon Jung, Roshan Sharma, William Chen, Bhiksha Raj, Shinji Watanabe

Abstract: Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech. Given variations in information captured and phrasing, recordings can be summarized in multiple ways. Therefore, it is more reasonable to consider a probabilistic distribution of all potential summaries rather than a single summary. However, conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary for every recording. Generating multiple human references would be ideal to better represent the distribution statistically, but is impractical because annotation is expensive. We tackle this challenge by proposing AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries for training and evaluation. First, we explore prompting strategies to generate synthetic summaries from ChatGPT. We validate the quality of synthetic summaries using multiple metrics including human evaluation, where we find that summaries generated using AugSumm are perceived as more valid by humans. Second, we develop methods to utilize synthetic summaries in training and evaluation. Experiments on How2 demonstrate that pre-training on synthetic summaries and fine-tuning on GT summaries improves ROUGE-L by 1 point on both GT and AugSumm-based test sets. AugSumm summaries are available at https://github.com/Jungjee/AugSumm.
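To make the evaluation idea concrete, the following is a minimal sketch of scoring one model output against both a GT reference and a synthetic reference with ROUGE-L. It is an illustration only, not the authors' evaluation code: the function names (`lcs_length`, `rouge_l_f1`, `score_against_references`) and the example sentences are hypothetical, and the metric is simplified (lowercasing and whitespace tokenization, no stemming, balanced F-measure), so its values will not match an official ROUGE scorer exactly.

```python
from typing import Dict, List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(hypothesis: str, reference: str) -> float:
    """Simplified ROUGE-L F1 (assumption: whitespace tokens, no stemming)."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    lcs = lcs_length(hyp, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def score_against_references(hypothesis: str,
                             references: Dict[str, str]) -> Dict[str, float]:
    """Score one hypothesis against several named references."""
    return {name: rouge_l_f1(hypothesis, ref)
            for name, ref in references.items()}


if __name__ == "__main__":
    # Invented example text, not taken from How2 or the AugSumm release.
    references = {
        "gt": "learn how to bake bread",
        "augsumm": "a guide to baking bread at home",
    }
    scores = score_against_references("how to bake bread at home", references)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

Reporting a score per reference set, as in this sketch, mirrors the abstract's distinction between GT and AugSumm-based test sets. How the paper aggregates scores over multiple references is not stated in the abstract, so no aggregation is shown.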

* This work has been submitted to the IEEE ICASSP for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. 5 pages
