François Buet

Evaluating Subtitle Segmentation for End-to-end Generation Systems

May 19, 2022

Alina Karakanta, François Buet, Mauro Cettolo, François Yvon

Abstract: Subtitles appear on screen as short pieces of text, segmented based on formal constraints (length) and syntactic/semantic criteria. Subtitle segmentation can be evaluated with sequence segmentation metrics against a human reference. However, standard segmentation metrics cannot be applied when systems generate outputs different than the reference, e.g. with end-to-end subtitling systems. In this paper, we study ways to conduct reference-based evaluations of segmentation accuracy irrespective of the textual content. We first conduct a systematic analysis of existing metrics for evaluating subtitle segmentation. We then introduce $Sigma$, a new Subtitle Segmentation Score derived from an approximate upper-bound of BLEU on segmentation boundaries, which allows us to disentangle the effect of good segmentation from text quality. To compare $Sigma$ with existing metrics, we further propose a boundary projection method from imperfect hypotheses to the true reference. Results show that all metrics are able to reward high quality output but for similar outputs system ranking depends on each metric's sensitivity to error type. Our thorough analyses suggest $Sigma$ is a promising segmentation candidate but its reliability over other segmentation metrics remains to be validated through correlations with human judgements.

* Accepted at LREC 2022
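
To make concrete why standard segmentation metrics need identical text on both sides, here is a minimal sketch of a boundary-level F1 score. It is not the $Sigma$ metric and not code from the paper. The `<eol>` and `<eob>` break markers, the function names, and the choice to ignore the break type are all assumptions made for illustration.

```python
# Illustrative sketch only: boundary F1 between a reference and a hypothesis
# segmentation of the SAME word sequence. The markers below are assumed.
BREAK_MARKERS = ("<eol>", "<eob>")  # assumed line-break / block-break symbols


def split_words_and_breaks(segmented):
    """Return (words, break_positions); a position is the number of words before the break."""
    words, positions = [], set()
    for token in segmented.split():
        if token in BREAK_MARKERS:
            positions.add(len(words))
        else:
            words.append(token)
    return words, positions


def boundary_f1(reference, hypothesis):
    """F1 over break positions, ignoring whether a break is <eol> or <eob>."""
    ref_words, ref_breaks = split_words_and_breaks(reference)
    hyp_words, hyp_breaks = split_words_and_breaks(hypothesis)
    if ref_words != hyp_words:
        # Positions are only comparable when the underlying text is identical.
        raise ValueError("reference and hypothesis texts differ")
    if not ref_breaks and not hyp_breaks:
        return 1.0
    matched = len(ref_breaks & hyp_breaks)
    precision = matched / len(hyp_breaks) if hyp_breaks else 0.0
    recall = matched / len(ref_breaks) if ref_breaks else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

When the hypothesis wording differs from the reference, as with end-to-end systems, this function raises an error because word positions no longer line up. That is the limitation the abstract describes.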

Joint Generation of Captions and Subtitles with Dual Decoding

May 13, 2022

Jitao Xu, François Buet, Josep Crego, Elise Bertin-Lemée, François Yvon

Abstract: As the amount of audio-visual content increases, the need to develop automatic captioning and subtitling solutions to match the expectations of a growing international audience appears as the only viable way to boost throughput and lower the related post-production costs. Automatic captioning and subtitling often need to be tightly intertwined to achieve an appropriate level of consistency and synchronization with each other and with the video signal. In this work, we assess a dual decoding scheme to achieve a strong coupling between these two tasks and show how adequacy and consistency are increased, with virtually no additional cost in terms of model size and training complexity.

* Accepted at IWSLT 2022
