Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vinaychandran Pondenkandath

Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs

Mar 15, 2021

Lars Vögtlin, Manuel Drazyk, Vinaychandran Pondenkandath, Michele Alberti, Rolf Ingold

Figure 1 for Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs

Figure 2 for Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs

Figure 3 for Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs

Figure 4 for Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs

Abstract:We present a framework to generate synthetic historical documents with precise ground truth using nothing more than a collection of unlabeled historical images. Obtaining large labeled datasets is often the limiting factor to effectively use supervised deep learning methods for Document Image Analysis (DIA). Prior approaches towards synthetic data generation either require expertise or result in poor accuracy in the synthetic documents. To achieve high precision transformations without requiring expertise, we tackle the problem in two steps. First, we create template documents with user-specified content and structure. Second, we transfer the style of a collection of unlabeled historical images to these template documents while preserving their text and layout. We evaluate the use of our synthetic historical documents in a pre-training setting and find that we outperform the baselines (randomly initialized and pre-trained). Additionally, with visual examples, we demonstrate a high-quality synthesis that makes it possible to generate large labeled historical document datasets with precise ground truth.

Via

Access Paper or Ask Questions

Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts

Jul 01, 2019

Michele Alberti, Lars Vögtlin, Vinaychandran Pondenkandath, Mathias Seuret, Rolf Ingold, Marcus Liwicki

Figure 1 for Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts

Figure 2 for Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts

Figure 3 for Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts

Figure 4 for Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts

Abstract:This paper introduces a new way for text-line extraction by integrating deep-learning based pre-classification and state-of-the-art segmentation methods. Text-line extraction in complex handwritten documents poses a significant challenge, even to the most modern computer vision algorithms. Historical manuscripts are a particularly hard class of documents as they present several forms of noise, such as degradation, bleed-through, interlinear glosses, and elaborated scripts. In this work, we propose a novel method which uses semantic segmentation at pixel level as intermediate task, followed by a text-line extraction step. We measured the performance of our method on a recent dataset of challenging medieval manuscripts and surpassed state-of-the-art results by reducing the error by 80.7%. Furthermore, we demonstrate the effectiveness of our approach on various other datasets written in different scripts. Hence, our contribution is two-fold. First, we demonstrate that semantic pixel segmentation can be used as strong denoising pre-processing step before performing text line extraction. Second, we introduce a novel, simple and robust algorithm that leverages the high-quality semantic segmentation to achieve a text-line extraction performance of 99.42% line IU on a challenging dataset.

* 2019 15th IAPR International Conference on Document Analysis and Recognition (ICDAR), Sydney, Australia

Via

Access Paper or Ask Questions

Improving Reproducible Deep Learning Workflows with DeepDIVA

Jun 11, 2019

Michele Alberti, Vinaychandran Pondenkandath, Lars Vögtlin, Marcel Würsch, Rolf Ingold, Marcus Liwicki

Figure 1 for Improving Reproducible Deep Learning Workflows with DeepDIVA

Figure 2 for Improving Reproducible Deep Learning Workflows with DeepDIVA

Abstract:The field of deep learning is experiencing a trend towards producing reproducible research. Nevertheless, it is still often a frustrating experience to reproduce scientific results. This is especially true in the machine learning community, where it is considered acceptable to have black boxes in your experiments. We present DeepDIVA, a framework designed to facilitate easy experimentation and their reproduction. This framework allows researchers to share their experiments with others, while providing functionality that allows for easy experimentation, such as: boilerplate code, experiment management, hyper-parameter optimization, verification of data integrity and visualization of data and results. Additionally, the code of DeepDIVA is well-documented and supported by several tutorials that allow a new user to quickly familiarize themselves with the framework.

* 6th Swiss Conference on Data Science (SDS), Bern, Switzerland, 2019

Via

Access Paper or Ask Questions

Survey of Artificial Intelligence for Card Games and Its Application to the Swiss Game Jass

Jun 11, 2019

Joel Niklaus, Michele Alberti, Vinaychandran Pondenkandath, Rolf Ingold, Marcus Liwicki

Abstract:In the last decades we have witnessed the success of applications of Artificial Intelligence to playing games. In this work we address the challenging field of games with hidden information and card games in particular. Jass is a very popular card game in Switzerland and is closely connected with Swiss culture. To the best of our knowledge, performances of Artificial Intelligence agents in the game of Jass do not outperform top players yet. Our contribution to the community is two-fold. First, we provide an overview of the current state-of-the-art of Artificial Intelligence methods for card games in general. Second, we discuss their application to the use-case of the Swiss card game Jass. This paper aims to be an entry point for both seasoned researchers and new practitioners who want to join in the Jass challenge.

* 6th Swiss Conference on Data Science (SDS), Bern, Switzerland, 2019

Via

Access Paper or Ask Questions

A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

May 22, 2019

Linda Studer, Michele Alberti, Vinaychandran Pondenkandath, Pinar Goktepe, Thomas Kolonko, Andreas Fischer, Marcus Liwicki, Rolf Ingold

Figure 1 for A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Figure 2 for A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Figure 3 for A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Figure 4 for A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Abstract:Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.

Via

Access Paper or Ask Questions

Leveraging Random Label Memorization for Unsupervised Pre-Training

Nov 05, 2018

Vinaychandran Pondenkandath, Michele Alberti, Sammer Puran, Rolf Ingold, Marcus Liwicki

Figure 1 for Leveraging Random Label Memorization for Unsupervised Pre-Training

Figure 2 for Leveraging Random Label Memorization for Unsupervised Pre-Training

Figure 3 for Leveraging Random Label Memorization for Unsupervised Pre-Training

Figure 4 for Leveraging Random Label Memorization for Unsupervised Pre-Training

Abstract:We present a novel approach to leverage large unlabeled datasets by pre-training state-of-the-art deep neural networks on randomly-labeled datasets. Specifically, we train the neural networks to memorize arbitrary labels for all the samples in a dataset and use these pre-trained networks as a starting point for regular supervised learning. Our assumption is that the "memorization infrastructure" learned by the network during the random-label training proves to be beneficial for the conventional supervised learning as well. We test the effectiveness of our pre-training on several video action recognition datasets (HMDB51, UCF101, Kinetics) by comparing the results of the same network with and without the random label pre-training. Our approach yields an improvement - ranging from 1.5% on UCF-101 to 5% on Kinetics - in classification accuracy, which calls for further research in this direction.

* 6 pages

Via

Access Paper or Ask Questions

Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Oct 17, 2018

Paul Maergner, Vinaychandran Pondenkandath, Michele Alberti, Marcus Liwicki, Kaspar Riesen, Rolf Ingold, Andreas Fischer

Figure 1 for Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Figure 2 for Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Figure 3 for Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Figure 4 for Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Abstract:Biometric authentication by means of handwritten signatures is a challenging pattern recognition task, which aims to infer a writer model from only a handful of genuine signatures. In order to make it more difficult for a forger to attack the verification system, a promising strategy is to combine different writer models. In this work, we propose to complement a recent structural approach to offline signature verification based on graph edit distance with a statistical approach based on metric learning with deep neural networks. On the MCYT and GPDS benchmark datasets, we demonstrate that combining the structural and statistical models leads to significant improvements in performance, profiting from their complementary properties.

* Structural, Syntactic, and Statistical Pattern Recognition. S+SSPR 2018. Lecture Notes in Computer Science, vol 11004. Springer, Cham

Via

Access Paper or Ask Questions

Are You Tampering With My Data?

Aug 21, 2018

Michele Alberti, Vinaychandran Pondenkandath, Marcel Würsch, Manuel Bouillon, Mathias Seuret, Rolf Ingold, Marcus Liwicki

Figure 1 for Are You Tampering With My Data?

Figure 2 for Are You Tampering With My Data?

Figure 3 for Are You Tampering With My Data?

Figure 4 for Are You Tampering With My Data?

Abstract:We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate on two widely used datasets (CIFAR-10 and SVHN) that a universal modification of just one pixel per image for all the images of a class in the training set is enough to corrupt the training procedure of several state-of-the-art deep neural networks causing the networks to misclassify any images to which the modification is applied. Our aim is to bring to the attention of the machine learning community, the possibility that even learning-based methods that are personally trained on public datasets can be subject to attacks by a skillful adversary.

* European Conference on Computer Vision (ECCV 2018), Workshop on Objectionable Content and Misinformation
* 18 pages

Via

Access Paper or Ask Questions

DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments

Apr 23, 2018

Michele Alberti, Vinaychandran Pondenkandath, Marcel Würsch, Rolf Ingold, Marcus Liwicki

Figure 1 for DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments

Figure 2 for DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments

Figure 3 for DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments

Figure 4 for DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments

Abstract:We introduce DeepDIVA: an infrastructure designed to enable quick and intuitive setup of reproducible experiments with a large range of useful analysis functionality. Reproducing scientific results can be a frustrating experience, not only in document image analysis but in machine learning in general. Using DeepDIVA a researcher can either reproduce a given experiment with a very limited amount of information or share their own experiments with others. Moreover, the framework offers a large range of functions, such as boilerplate code, keeping track of experiments, hyper-parameter optimization, and visualization of data and results. To demonstrate the effectiveness of this framework, this paper presents case studies in the area of handwritten document analysis where researchers benefit from the integrated functionality. DeepDIVA is implemented in Python and uses the deep learning framework PyTorch. It is completely open source, and accessible as Web Service through DIVAServices.

* Submitted at the 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 6 pages, 6 Figures

Via

Access Paper or Ask Questions

Identifying Cross-Depicted Historical Motifs

Apr 05, 2018

Vinaychandran Pondenkandath, Michele Alberti, Nicole Eichenberger, Rolf Ingold, Marcus Liwicki

Figure 1 for Identifying Cross-Depicted Historical Motifs

Figure 2 for Identifying Cross-Depicted Historical Motifs

Figure 3 for Identifying Cross-Depicted Historical Motifs

Figure 4 for Identifying Cross-Depicted Historical Motifs

Abstract:Cross-depiction is the problem of identifying the same object even when it is depicted in a variety of manners. This is a common problem in handwritten historical documents image analysis, for instance when the same letter or motif is depicted in several different ways. It is a simple task for humans yet conventional heuristic computer vision methods struggle to cope with it. In this paper we address this problem using state-of-the-art deep learning techniques on a dataset of historical watermarks containing images created with different methods of reproduction, such as hand tracing, rubbing, and radiography. To study the robustness of deep learning based approaches to the cross-depiction problem, we measure their performance on two different tasks: classification and similarity rankings. For the former we achieve a classification accuracy of 96% using deep convolutional neural networks. For the latter we have a false positive rate at 95% true positive rate of 0.11. These results outperform state-of-the-art methods by a significant margin.

* 6 pages, 6 figures, Submitted at the 16th International Conference on Frontiers in Handwriting Recognition (ICFHR 2018)

Via

Access Paper or Ask Questions