Title: Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval

Jul 16, 2020

Christopher Thomas, Adriana Kovashka

Figures 1-4 for Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval

Abstract: The abundance of multimodal data (e.g., social media posts) has inspired interest in cross-modal retrieval methods. Popular approaches rely on a variety of metric learning losses, which prescribe how close image and text should be in the learned space. However, most prior methods have focused on the case where image and text convey redundant information; in contrast, real-world image-text pairs convey complementary information with little overlap. Further, images in news articles and media portray topics in a visually diverse fashion; thus, we need to take special care to ensure a meaningful image representation. We propose novel within-modality losses that encourage semantic coherency in both the text and image subspaces; this semantic coherency does not necessarily align with visual coherency. Our method ensures not only that paired images and texts are close, but also that the expected image-image and text-text relationships are observed. Our approach improves the results of cross-modal retrieval on four datasets compared to five baselines.
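To make the idea in the abstract concrete, here is a minimal sketch of how a cross-modal metric learning term might be combined with within-modality terms. It is not the authors' implementation: the abstract does not specify the loss form, so the triplet formulation, the Euclidean distance, and the names `triplet_loss`, `combined_loss`, `margin`, and `within_weight` are all assumptions made for illustration. How the semantic neighbors are chosen is also left open here.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: pull the positive closer than the negative.

    Assumption: a standard triplet margin loss stands in for whatever
    metric learning loss the paper actually uses.
    """
    return max(
        0.0,
        l2_distance(anchor, positive) - l2_distance(anchor, negative) + margin,
    )


def combined_loss(
    img, txt, neg_txt,
    img_neighbor, img_non_neighbor,
    txt_neighbor, txt_non_neighbor,
    margin=0.2, within_weight=1.0,
):
    """Cross-modal term plus hypothetical within-modality terms.

    - cross: the paired text should be closer to the image than an unpaired text.
    - within_img / within_txt: a semantic neighbor should be closer than a
      non-neighbor inside each modality (image-image and text-text).
    """
    cross = triplet_loss(img, txt, neg_txt, margin)
    within_img = triplet_loss(img, img_neighbor, img_non_neighbor, margin)
    within_txt = triplet_loss(txt, txt_neighbor, txt_non_neighbor, margin)
    return cross + within_weight * (within_img + within_txt)
```

In this sketch the within-modality terms are added on top of the cross-modal term, so a pair that is already well aligned across modalities can still incur a loss when its image-image or text-text relationships are violated.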

* Proceedings of the European Conference on Computer Vision (ECCV) 2020
