Title: Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data

Jan 26, 2023

Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

Abstract: We present a novel data-efficient semi-supervised framework that improves the generalization of image captioning models. Constructing a large-scale labeled image captioning dataset is expensive in labor, time, and cost. In contrast to manually annotating every training sample, separately collecting uni-modal datasets (e.g., a large-scale image dataset and a sentence dataset) is far easier. We leverage such massive unpaired image and caption data on top of standard paired data by learning to associate them. To this end, our semi-supervised learning method assigns pseudo-labels to unpaired samples in an adversarial learning fashion, in which the joint distribution of images and captions is learned. Our method trains a captioner to learn from paired data and to progressively associate unpaired data. This approach yields noticeable performance improvements even in challenging scenarios, including out-of-task data (i.e., relational captioning, where the target task differs from that of the unpaired data) and web-crawled data. We also show that the proposed method is theoretically well-motivated and has a favorable global optimal property. Extensive empirical results on (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis on the scarcely-paired COCO dataset, demonstrate the consistent effectiveness of our semi-supervised learning method with unpaired data compared to competing methods.

* Journal extension of our EMNLP 2019 paper (arXiv:1909.02201)
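
The abstract mentions assigning pseudo-labels to unpaired samples using an adversarially learned model of the image-caption joint distribution. As a rough illustration only, and not the authors' implementation, the sketch below shows one plausible reading: a discriminator-like score ranks candidate captions for each unpaired image, and the best match is kept as a pseudo-label if its score clears a threshold. The function names `score` and `assign_pseudo_labels`, the toy 2-D vectors, and the threshold are all hypothetical; a real system would use a trained discriminator over learned image and caption features.

```python
import math

def score(image_vec, caption_vec):
    """Toy stand-in for a learned discriminator (assumption):
    sigmoid of the dot product between image and caption vectors."""
    dot = sum(a * b for a, b in zip(image_vec, caption_vec))
    return 1.0 / (1.0 + math.exp(-dot))

def assign_pseudo_labels(images, captions, threshold=0.5):
    """For each unpaired image, pick the highest-scoring unpaired caption.

    Returns {image_id: (caption, score)} for matches with score >= threshold.
    Images whose best match falls below the threshold get no pseudo-label.
    """
    pseudo = {}
    for img_id, img_vec in images.items():
        best_cap, best_score = None, -1.0
        for cap, cap_vec in captions.items():
            s = score(img_vec, cap_vec)
            if s > best_score:
                best_cap, best_score = cap, s
        if best_cap is not None and best_score >= threshold:
            pseudo[img_id] = (best_cap, best_score)
    return pseudo

# Hypothetical toy data: hand-made 2-D "features" for images and captions.
unpaired_images = {"img_dog": [1.0, 0.0], "img_car": [0.0, 1.0]}
unpaired_captions = {"a dog runs": [2.0, -1.0], "a red car": [-1.0, 2.0]}

pseudo_pairs = assign_pseudo_labels(unpaired_images, unpaired_captions, threshold=0.6)
for img_id, (cap, s) in pseudo_pairs.items():
    print(f"{img_id} -> {cap!r} (score {s:.2f})")
```

In a training loop of the kind the abstract describes, such pseudo-pairs would be recomputed as the scoring model improves, so that unpaired data is associated progressively rather than in a single pass.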
