Title: Dual-Stream Transformer for Generic Event Boundary Captioning

Jul 07, 2022

Xin Gu, Hanhua Ye, Guang Chen, Yufei Wang, Libo Zhang, Longyin Wen

Abstract: This paper describes our champion solution for the CVPR2022 Generic Event Boundary Captioning (GEBC) competition. GEBC requires the captioning model to comprehend instantaneous status changes around a given video boundary, which makes it much more challenging than the conventional video captioning task. In this paper, a Dual-Stream Transformer with improvements in both video content encoding and caption generation is proposed: (1) We utilize three pre-trained models to extract video features at different granularities. Moreover, we exploit the boundary types as hints to help the model generate captions. (2) We design a model, termed Dual-Stream Transformer, to learn discriminative representations for boundary captioning. (3) To generate content-relevant and human-like captions, we improve the description quality with a word-level ensemble strategy. The promising results on the GEBC test split demonstrate the efficacy of our proposed model.
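The abstract does not spell out how the word-level ensemble works. As a rough illustration only, the sketch below shows one common form of word-level ensembling: at every decoding step the next-word distributions of several captioning models are averaged and the most probable word is taken. The `StepFn` interface, the function name `word_level_ensemble`, the greedy decoding, and the toy models are all assumptions made for this example; they are not taken from the paper and may differ from the authors' actual strategy.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a "model" maps the word ids generated so far
# to a probability distribution over the vocabulary.
StepFn = Callable[[Sequence[int]], List[float]]


def word_level_ensemble(models: Sequence[StepFn], eos_id: int,
                        max_len: int = 20) -> List[int]:
    """Greedy decoding in which each word is picked from the averaged distribution."""
    prefix: List[int] = []
    for _ in range(max_len):
        # Ask every model for its next-word distribution given the shared prefix.
        dists = [model(prefix) for model in models]
        vocab_size = len(dists[0])
        # Average the distributions position by position.
        avg = [sum(d[i] for d in dists) / len(dists) for i in range(vocab_size)]
        # Pick the word with the highest averaged probability.
        next_id = max(range(vocab_size), key=avg.__getitem__)
        if next_id == eos_id:
            break
        prefix.append(next_id)
    return prefix


# Toy models over a 3-word vocabulary (id 2 is the end-of-sentence token).
def model_a(prefix: Sequence[int]) -> List[float]:
    return [0.6, 0.3, 0.1] if len(prefix) == 0 else [0.1, 0.2, 0.7]


def model_b(prefix: Sequence[int]) -> List[float]:
    return [0.1, 0.8, 0.1] if len(prefix) == 0 else [0.2, 0.1, 0.7]


caption_ids = word_level_ensemble([model_a, model_b], eos_id=2)
```

In this toy setup `model_a` alone would start the caption with word 0, but the averaged distribution favours word 1, so the ensemble can override a single model's choice one word at a time rather than selecting between whole captions.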
