Title: Enhancing Handwritten Text Recognition with N-gram sequence decomposition and Multitask Learning

Dec 28, 2020

Vasiliki Tassopoulou, George Retsinas, Petros Maragos

Abstract: Current state-of-the-art approaches in Handwritten Text Recognition are predominantly single-task, with unigram, character-level target units. In our work, we utilize a Multi-task Learning scheme, training the model to perform decompositions of the target sequence with target units of different granularity, from fine to coarse. We consider this method a way to utilize n-gram information implicitly in the training process, while the final recognition is performed using only the unigram output. Unigram decoding of such a multi-task approach highlights the capability of the learned internal representations, imposed by the different n-grams at the training step. We select n-grams as our target units and experiment from unigrams to four-grams, namely subword-level granularities. These multiple decompositions are learned by the network with task-specific CTC losses. Concerning network architectures, we propose two alternatives, namely the Hierarchical and the Block Multi-task. Overall, our proposed model, even though evaluated only on the unigram task, outperforms its single-task counterpart by an absolute 2.52% WER and 1.02% CER with greedy decoding, without any computational overhead during inference, hinting towards successfully imposing an implicit language model.
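To make the idea of decomposing one target sequence at several granularities concrete, here is a minimal sketch in Python. It assumes overlapping, sliding-window character n-grams with stride 1; the abstract does not state how the n-gram targets are actually constructed, so this windowing scheme and the function names `ngram_targets` and `decompose` are illustrative assumptions rather than the authors' procedure.

```python
def ngram_targets(word, n):
    """Split a word into overlapping character n-grams.

    Assumption: sliding window with stride 1. The paper may build its
    n-gram targets differently; this is only an illustration.
    """
    if len(word) < n:
        return []  # word too short to contain any n-gram of this size
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def decompose(word, max_n=4):
    """Return one target sequence per granularity, from unigrams up to max_n-grams."""
    return {n: ngram_targets(word, n) for n in range(1, max_n + 1)}


targets = decompose("text")
for n, units in targets.items():
    print(n, units)
# 1 ['t', 'e', 'x', 't']
# 2 ['te', 'ex', 'xt']
# 3 ['tex', 'ext']
# 4 ['text']
```

Under the scheme described in the abstract, each of these sequences would serve as the target of its own task-specific CTC loss during training, while only the unigram output is decoded at inference time.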

* ICPR 2020
