Michal Hradiš

AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization

Mar 28, 2025

Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets

Mar 28, 2025

Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts

Mar 25, 2025

BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction

Mar 25, 2025

TextBite: A Historical Czech Document Dataset for Logical Page Segmentation

Mar 20, 2025

Self-supervised Pre-training of Text Recognizers

May 01, 2024

Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

Feb 13, 2023

Towards Writing Style Adaptation in Handwriting Recognition

Feb 13, 2023

SoftCTC – Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

Dec 05, 2022

Importance of Textlines in Historical Document Classification

Jan 24, 2022