Anders Brun

Neural Word Search in Historical Manuscript Collections

Dec 06, 2018

Tomas Wilkinson, Jonas Lindström, Anders Brun

Abstract: We address the problem of segmenting and retrieving word images in collections of historical manuscripts given a text query, a task commonly referred to as "word spotting". To this end, we first propose an end-to-end trainable model based on deep neural networks that we dub Ctrl-F-Net. The model simultaneously generates region proposals and embeds them into a word embedding space, in which the search is performed. We further introduce a simplified version called Ctrl-F-Mini, which is faster and performs similarly but is limited to manuscripts that are easier to segment. We evaluate both models on common benchmark datasets and surpass the previous state of the art. Finally, in collaboration with historians, we employ Ctrl-F-Net to search a large manuscript collection of over 100 thousand pages, written across two centuries. With only 11 training pages, we enable large-scale data collection in manuscript-based historical research, speeding up data collection and increasing the number of manuscripts processed by orders of magnitude. Given the time-consuming manual work required to study old manuscripts in the humanities, quick and robust tools for word spotting have the potential to revolutionise domains like history, religion and language.

* Extension of arXiv:1703.07645
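
The abstract above describes searching among region proposals in a word embedding space. As a rough illustration of that retrieval step only, the sketch below ranks candidate regions by cosine similarity to a query embedding. The similarity measure, the function names, the region identifiers and the toy three-dimensional vectors are all assumptions made for this example; the abstract does not specify them, and this is not the Ctrl-F-Net implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_proposals(query_embedding, proposals, top_k=3):
    """Return the top_k (region_id, score) pairs, most similar to the query first.

    proposals is a list of (region_id, embedding) pairs; both names are
    hypothetical and stand in for whatever a detector would output.
    """
    scored = [(cosine_similarity(query_embedding, emb), rid) for rid, emb in proposals]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(rid, score) for score, rid in scored[:top_k]]


# Toy data: three proposed regions with made-up embeddings.
proposals = [
    ("page1_box3", [0.9, 0.1, 0.0]),
    ("page1_box7", [0.0, 1.0, 0.0]),
    ("page2_box1", [0.7, 0.7, 0.0]),
]
query = [1.0, 0.0, 0.0]  # made-up embedding of the text query

ranking = rank_proposals(query, proposals, top_k=2)
for region_id, score in ranking:
    print(region_id, round(score, 3))
```

In a real collection the list of proposals would span many pages, so an approximate nearest-neighbour index would likely replace the exhaustive sort used here.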

PDNet: Semantic Segmentation integrated with a Primal-Dual Network for Document binarization

May 17, 2018

Kalyan Ram Ayyalasomayajula, Filip Malmberg, Anders Brun

Abstract: Binarization of digital documents is the task of classifying each pixel in an image of the document as belonging to the background (parchment/paper) or the foreground (text/ink). Historical documents are often subject to degradations that make the task challenging. In this work, we propose a deep neural network architecture that combines a fully convolutional network with an unrolled primal-dual network and can be trained end-to-end, achieving state-of-the-art binarization on four out of seven datasets. Document binarization is formulated as an energy minimization problem. A fully convolutional neural network is trained for semantic segmentation of pixels and provides the labeling cost associated with each pixel. This cost estimate is refined along the edges, using a primal-dual approach, to compensate for any over- or under-estimation of the foreground class. We provide the necessary overview of the proximal operator, which supplies the theoretical underpinning required to train a primal-dual network using a gradient descent algorithm. We also handle the numerical instabilities that arise from the recurrent nature of the primal-dual approach. We report experimental results on the document binarization competition dataset, along with the network changes and hyperparameter tuning required for the stability and performance of the network. The network performs better on the competition metrics when pre-trained on a synthetic dataset.

* Under consideration for Pattern Recognition Letters Special Issue on Graphonomics for e-citizens: e-health, e-society, e-education. 11 pages, 10 figures, 2 tables
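
To make the idea of refining per-pixel labeling costs with a primal-dual scheme more concrete, here is a minimal one-dimensional sketch. It assumes a specific energy that the abstract does not spell out: a relaxed labeling u in [0, 1] that minimises sum_i c_i * u_i + lam * sum_i |u_{i+1} - u_i|, solved with a standard first-order primal-dual iteration using clipping as the proximal step. The energy, the step sizes, the function name and the toy costs are all assumptions for illustration; this is not the PDNet architecture, and nothing here is learned.

```python
def primal_dual_labeling(costs, lam=0.5, iterations=2000, tau=0.45, sigma=0.45):
    """Minimise sum(c_i * u_i) + lam * sum(|u_{i+1} - u_i|) over u in [0, 1]^n.

    costs: per-pixel labeling costs (negative values favour foreground, u = 1).
    tau, sigma: primal and dual step sizes; tau * sigma * 4 < 1 is needed here
    because the 1-D forward-difference operator has squared norm at most 4.
    """
    n = len(costs)
    u = [0.0] * n          # primal variable (relaxed labels)
    u_bar = u[:]           # extrapolated primal variable
    p = [0.0] * (n - 1)    # dual variable, one entry per neighbouring pair

    for _ in range(iterations):
        # Dual ascent on the differences of u_bar, then projection onto [-lam, lam].
        for i in range(n - 1):
            step = p[i] + sigma * (u_bar[i + 1] - u_bar[i])
            p[i] = max(-lam, min(lam, step))

        # Primal descent: apply the adjoint of the difference operator, add the
        # costs, then project onto [0, 1] (the proximal step for this energy).
        u_new = []
        for j in range(n):
            left = p[j - 1] if j > 0 else 0.0
            right = p[j] if j < n - 1 else 0.0
            step = u[j] - tau * ((left - right) + costs[j])
            u_new.append(max(0.0, min(1.0, step)))

        # Over-relaxation step used by this family of primal-dual methods.
        u_bar = [2.0 * new - old for new, old in zip(u_new, u)]
        u = u_new
    return u


# Toy costs: the third pixel weakly prefers background but sits inside a
# foreground run, so the smoothness term outweighs its individual cost.
costs = [-1.0, -1.0, 0.2, -1.0, -1.0, 1.0, 1.0, 1.0]
relaxed = primal_dual_labeling(costs)
labels = [1 if value > 0.5 else 0 for value in relaxed]
print(labels)
```

Thresholding each cost independently would mark the third pixel as background, whereas the energy above is lower when it joins the surrounding foreground. This is the kind of correction a smoothness-regularised refinement is meant to provide.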

Neural Ctrl-F: Segmentation-free Query-by-String Word Spotting in Handwritten Manuscript Collections

Aug 17, 2017

Tomas Wilkinson, Jonas Lindström, Anders Brun

Abstract: In this paper, we approach the problem of segmentation-free query-by-string word spotting for handwritten documents. In other words, we use methods inspired by computer vision and machine learning to search for words in large collections of digitized manuscripts. In particular, we are interested in historical handwritten texts, which are often far more challenging than modern printed documents. This task is important, as it provides people with a way to quickly find what they are looking for in large collections that are tedious and difficult to read manually. To this end, we introduce an end-to-end trainable model based on deep neural networks that we call Ctrl-F-Net. Given a full manuscript page, the model simultaneously generates region proposals and embeds them into a distributed word embedding space, where searches are performed. We evaluate the model on common benchmarks for handwritten word spotting, outperforming the previous state-of-the-art segmentation-free approaches by a large margin, and in some cases even segmentation-based approaches. One interesting real-life application of our approach is to help historians find and count specific words in court records that are related to women's sustenance activities and division of labor. We provide promising preliminary experiments that validate our method on this task.

* To appear in ICCV 2017
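
Query-by-string search needs a way to map the text query into the same space as the image regions. The abstract does not say which string embedding is used, so the sketch below uses a pyramidal histogram of characters (PHOC), a representation from the word-spotting literature, purely as an assumed stand-in. The alphabet, the pyramid levels and the function name are choices made for this example only.

```python
import string

# Assumed alphabet for the illustration: lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3), alphabet=ALPHABET):
    """Build a binary PHOC-style vector for a query string.

    At each pyramid level the word is split into `level` equal regions, and a
    character is assigned to a region when at least half of the character's
    span falls inside it. Integer arithmetic (units of 1 / (n * level)) keeps
    the half-overlap test exact.
    """
    word = word.lower()
    n = len(word)
    vector = []
    for level in levels:
        for region in range(level):
            region_start, region_end = region * n, (region + 1) * n
            bits = [0] * len(alphabet)
            for i, ch in enumerate(word):
                if ch not in alphabet:
                    continue  # characters outside the alphabet are ignored
                char_start, char_end = i * level, (i + 1) * level
                overlap = min(char_end, region_end) - max(char_start, region_start)
                if 2 * overlap >= level:  # at least half of the character's span
                    bits[alphabet.index(ch)] = 1
            vector.extend(bits)
    return vector


query_vector = phoc("ab", levels=(1, 2))
print(len(query_vector), sum(query_vector))
```

Every query string maps to a vector of the same length regardless of word length, which is what allows it to be compared directly with fixed-size embeddings predicted for image regions.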
