Ritika Jha

RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization

Apr 15, 2024

Avinash Anand, Raj Jaiswal, Mohit Gupta, Siddhesh S Bangar, Pijush Bhuyan, Naman Lal, Rajeev Singh, Ritika Jha, Rajiv Ratn Shah, Shin'ichi Satoh

Abstract: Large ground-truth datasets and recent advances in deep learning techniques have been useful for layout detection. However, because of the restricted layout diversity of these datasets, training on them requires a sizable number of annotated instances, which is both expensive and time-consuming. As a result, differences between the source and target domains may significantly impact how well these models function. To solve this problem, domain adaptation approaches have been developed that use a small quantity of labeled data to adjust the model to the target domain. In this research, we introduce a synthetic document dataset called RanLayNet, enriched with automatically assigned labels denoting the spatial positions, ranges, and types of layout elements. The primary aim is to develop a versatile dataset capable of training models that are robust and adaptable to diverse document formats. Through empirical experimentation, we demonstrate that a deep layout identification model trained on our dataset performs better than a model trained solely on actual documents. Moreover, we conduct a comparative analysis by fine-tuning inference models using both the PubLayNet and IIIT-AR-13K datasets on the DocLayNet dataset. Our findings show that models enriched with our dataset perform well on such tasks, achieving mAP95 scores of 0.398 and 0.588 in the scientific document domain for the TABLE class.
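The idea of a synthetic layout with automatically assigned labels can be illustrated with a minimal sketch. This is not the authors' generation pipeline: the class names, page size, box-size ranges, and the COCO-style `(x, y, width, height)` box format are all assumptions chosen for illustration, and the function names `overlaps` and `random_layout` are hypothetical.

```python
import random

# Hypothetical label set; the abstract does not list the actual classes.
CLASSES = ["text", "title", "table", "figure", "list"]


def overlaps(a, b):
    """Return True if two (x, y, w, h) boxes intersect with positive area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def random_layout(page_w=612, page_h=792, n_elements=6, max_tries=200, seed=None):
    """Place up to n_elements non-overlapping boxes at random positions.

    Each placed element carries its own label (category and bounding box),
    so the annotation comes for free with the generated page.
    """
    rng = random.Random(seed)
    placed = []
    tries = 0
    while len(placed) < n_elements and tries < max_tries:
        tries += 1
        # Assumed size ranges, expressed as fractions of the page.
        w = rng.randint(page_w // 6, page_w // 2)
        h = rng.randint(page_h // 12, page_h // 4)
        x = rng.randint(0, page_w - w)
        y = rng.randint(0, page_h - h)
        box = (x, y, w, h)
        # Reject candidates that collide with an already placed element.
        if any(overlaps(box, p["bbox"]) for p in placed):
            continue
        placed.append({"category": rng.choice(CLASSES), "bbox": box})
    return placed


if __name__ == "__main__":
    for element in random_layout(seed=0):
        print(element)
```

Rejection sampling keeps the sketch short; it may return fewer than `n_elements` boxes when the page fills up, which is acceptable for an illustration.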

MaskMTL: Attribute prediction in masked facial images with deep multitask learning

Jan 11, 2022

Prerana Mukherjee, Vinay Kaushik, Ronak Gupta, Ritika Jha, Daneshwari Kankanwadi, Brejesh Lall

Abstract: Predicting attributes in landmark-free facial images is itself a challenging task, which is further complicated when the face is occluded by a mask. Smart access control gates that rely on identity verification, or secure login to personal electronic gadgets, may use the face as a biometric trait. In particular, the COVID-19 pandemic has increasingly shown the need for hygienic and contactless identity verification. In such cases, the use of masks becomes inevitable, and attribute prediction helps in segregating vulnerable target groups from community spread or ensuring social distancing for them in a collaborative environment. We create a masked face dataset by efficiently overlaying masks of different shapes, sizes, and textures to effectively model the variability introduced by wearing a mask. This paper presents a deep Multi-Task Learning (MTL) approach to jointly estimate various heterogeneous attributes from a single masked facial image. Experimental results on the benchmark UTKFace face-attribute dataset demonstrate that the proposed approach outperforms competing techniques. The source code is available at https://github.com/ritikajha/Attribute-prediction-in-masked-facial-images-with-deep-multitask-learning

* In Proceedings of 9th International Conference on Pattern Recognition and Machine Intelligence (PReMI 2021), Kolkata, India
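Jointly estimating heterogeneous attributes usually means combining a regression loss and one or more classification losses into a single training objective. The sketch below shows one common way to do that, assuming the three attributes annotated in UTKFace (age, gender, ethnicity). The abstract does not state the loss used in the paper, so the L1 age term, the cross-entropy terms, the equal default weights, and the function names are all assumptions.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multitask_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-task losses for one sample.

    pred:   {"age": float, "gender_logits": [...], "race_logits": [...]}
    target: {"age": float, "gender": int, "race": int}
    The weights are hypothetical; in practice they are tuned or learned.
    """
    w_age, w_gender, w_race = weights
    age_loss = abs(pred["age"] - target["age"])  # L1 regression term
    gender_loss = cross_entropy(pred["gender_logits"], target["gender"])
    race_loss = cross_entropy(pred["race_logits"], target["race"])
    return w_age * age_loss + w_gender * gender_loss + w_race * race_loss


if __name__ == "__main__":
    pred = {"age": 31.5, "gender_logits": [0.2, 1.1], "race_logits": [0.1, 0.3, 2.0, -1.0, 0.0]}
    target = {"age": 30.0, "gender": 1, "race": 2}
    print(multitask_loss(pred, target))
```

Because the age term is measured in years while the classification terms are in nats, the weights typically need rescaling so that no single task dominates the gradient.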
