Feed-forward networks are widely used in cross-modal applications to bridge modalities by mapping distributed vectors of one modality to those of the other, or to a shared space. The predicted vectors are then used to perform tasks such as retrieval or labeling. Thus, the success of the whole system relies on the ability of the mapping to make the neighborhood structure (i.e., the pairwise similarities) of the predicted vectors resemble that of the target vectors. However, whether this is actually achieved has not yet been investigated. Here, we propose a new similarity measure and two ad hoc experiments to shed light on this issue. On three cross-modal benchmarks, we learn a large number of language-to-vision and vision-to-language neural network mappings (up to five layers deep) using a rich diversity of image and text features and loss functions. Our results reveal that, surprisingly, the neighborhood structure of the predicted vectors consistently resembles that of the input vectors more closely than that of the target vectors. In a second experiment, we further show that untrained networks do not significantly disrupt the neighborhood (i.e., semantic) structure of the input vectors.
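To make the notion of "neighborhood structure" concrete, the sketch below shows one simple way such a comparison could be computed: for each item, take the k nearest neighbors of its vector within each space and average the overlap between the two neighbor sets. The function name `neighborhood_overlap`, the use of cosine similarity, the choice of k, and the random stand-in data are illustrative assumptions for this sketch, not necessarily the measure proposed in the paper.

```python
import numpy as np

def neighborhood_overlap(A, B, k=10):
    """Mean overlap between the k nearest neighbors of each item,
    computed within space A and within space B (cosine similarity).

    A, B: arrays of shape (n_items, dim_A) and (n_items, dim_B),
    where row i of A and row i of B refer to the same concept.
    Returns a value in [0, 1]; 1 means identical neighborhoods.
    """
    def knn_indices(X):
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
        sims = Xn @ Xn.T                                   # pairwise cosine similarities
        np.fill_diagonal(sims, -np.inf)                    # exclude the item itself
        return np.argsort(-sims, axis=1)[:, :k]            # k nearest neighbors per item

    nn_a, nn_b = knn_indices(A), knn_indices(B)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

# Example usage: compare the neighborhood structure of predicted vectors
# against (i) the target vectors and (ii) the input vectors.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(100, 300))                  # e.g., text features
targets = rng.normal(size=(100, 128))                 # e.g., image features
predicted = inputs @ rng.normal(size=(300, 128))      # stand-in for a learned mapping

print("pred vs. target:", neighborhood_overlap(predicted, targets))
print("pred vs. input: ", neighborhood_overlap(predicted, inputs))
```

Under this kind of measure, the paper's finding corresponds to the "pred vs. input" overlap being consistently higher than the "pred vs. target" overlap.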