Title: LEWIS: Latent Embeddings for Word Images and their Semantics

Sep 21, 2015

Albert Gordo, Jon Almazan, Naila Murray, Florent Perronnin

Abstract: The goal of this work is to bring semantics into the tasks of text recognition and retrieval in natural images. Although text recognition and retrieval have received a lot of attention in recent years, previous works have focused on recognizing or retrieving exactly the same word used as a query, without taking semantics into consideration. In this paper, we ask the following question: can we predict semantic concepts directly from a word image, without explicitly trying to transcribe the word image or its characters at any point? To this end, we propose a convolutional neural network (CNN) with a weighted ranking loss objective that ensures that the concepts relevant to the query image are ranked ahead of those that are not relevant. This can also be interpreted as learning a Euclidean space where word images and concepts are jointly embedded. The model is learned in an end-to-end manner, from image pixels to semantic concepts, using a dataset of synthetically generated word images and concepts mined from a lexical database (WordNet). Our results show that, despite the complexity of the task, word images and concepts can indeed be associated with a high degree of accuracy.

* Accepted for publication at the International Conference on Computer Vision (ICCV) 2015
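
The ranking objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the margin, the harmonic weighting of violations, and all function and parameter names below are assumptions chosen for illustration. It only shows the general idea that a relevant concept should score higher than every non-relevant concept for a given word image, with larger penalties when more non-relevant concepts are ranked too high.

```python
# Illustrative sketch only; every name and design choice here is hypothetical
# and not taken from the paper.

def score(image_emb, concept_emb):
    """Compatibility of an image and a concept in the shared space.

    Assumption: a plain dot product between the two embedding vectors.
    """
    return sum(a * b for a, b in zip(image_emb, concept_emb))


def weighted_ranking_loss(image_emb, concept_embs, relevant, margin=1.0):
    """Weighted pairwise hinge loss over (relevant, non-relevant) concept pairs.

    image_emb    -- embedding of one word image (list of floats)
    concept_embs -- list of concept embeddings, one per concept
    relevant     -- set of indices of the concepts relevant to the image
    margin       -- required score gap between relevant and non-relevant
    """
    scores = [score(image_emb, c) for c in concept_embs]
    positives = [i for i in range(len(scores)) if i in relevant]
    negatives = [i for i in range(len(scores)) if i not in relevant]

    loss = 0.0
    for p in positives:
        # Hinge penalty for each non-relevant concept that is not at least
        # `margin` below the relevant concept p.
        hinges = [max(0.0, margin - scores[p] + scores[n]) for n in negatives]
        violations = sum(1 for h in hinges if h > 0)
        if violations == 0:
            continue  # p is already ranked ahead of all non-relevant concepts
        # Assumed weighting: harmonic sum 1 + 1/2 + ... + 1/violations, so a
        # badly ranked relevant concept contributes more to the loss.
        weight = sum(1.0 / j for j in range(1, violations + 1))
        loss += weight * sum(hinges) / violations
    return loss


if __name__ == "__main__":
    image = [1.0, 0.0]
    concepts = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    # Concept 0 is relevant and already scores highest: no penalty.
    print(weighted_ranking_loss(image, concepts, {0}))  # 0.0
    # Concept 2 is relevant but scores lowest: both negatives violate.
    print(weighted_ranking_loss(image, concepts, {2}))  # 4.5
```

In an end-to-end setting, the image embedding would come from the CNN and the concept embeddings would be learned jointly, so that minimizing a loss of this kind moves word images toward their relevant concepts in the shared space.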
