Abstract: We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originates from the same or different writer distribution. We compare the performance of multiple generative and contrastive SSL approaches against handcrafted feature extractors and supervised learning on the CEDAR "AND" dataset. We show that a ResNet-based Variational Auto-Encoder (VAE) outperforms the other generative approaches, achieving 76.3% accuracy, while a ResNet-18 fine-tuned using Variance-Invariance-Covariance Regularization (VICReg) outperforms the other contrastive approaches, achieving 78% accuracy. Using the pre-trained VAE and VICReg models for the downstream task of writer verification, we observe relative accuracy improvements of 6.7% and 9%, respectively, over a ResNet-18 supervised baseline trained with 10% of the writer labels.
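As a concrete illustration of the contrastive objective named above, here is a minimal VICReg loss sketch in PyTorch. The coefficients (lam = mu = 25, nu = 1) follow the original VICReg paper's defaults; this is an assumption-laden reimplementation of the published loss, not the SSL-HV authors' code.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, lam=25.0, mu=25.0, nu=1.0, eps=1e-4):
    """z_a, z_b: (N, D) embeddings of two augmented views of the same batch."""
    n, d = z_a.shape
    # Invariance: pull the two views' embeddings together.
    inv = F.mse_loss(z_a, z_b)
    # Variance: keep each embedding dimension's std above 1 (hinge).
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))
    # Covariance: decorrelate dimensions by penalizing off-diagonal covariance.
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (n - 1)
    cov_b = (z_b.T @ z_b) / (n - 1)
    off_diag = lambda c: c.flatten()[:-1].view(d - 1, d + 1)[:, 1:].flatten()
    cov = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d
    return lam * inv + mu * var + nu * cov
```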
Abstract: Self-supervised learning provides an opportunity to exploit the unlabeled chest X-rays and associated free-text reports accumulated in clinical routine, without manual supervision. This paper proposes a Joint Image-Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model is pre-trained for visual-textual matching at both the global image-sentence level and the local image region-word level. Both levels are bidirectionally constrained by cross-entropy-based and ranking-based triplet matching losses. Region-word matching is computed with an attention mechanism, without direct supervision of the mapping. The pre-trained multi-modal representations pave the way for downstream tasks involving image and/or text encoding. We demonstrate the quality of the learned representations through cross-modality retrieval and multi-label classification on two datasets: OpenI-IU and MIMIC-CXR.
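A ranking-based triplet matching loss of the kind mentioned above can be sketched as follows. This assumes VSE++-style hardest-negative mining over a batch similarity matrix, which may differ from JoImTeRNet's exact formulation.

```python
import torch

def bidirectional_triplet_loss(sim, margin=0.2):
    """sim: (B, B) similarity matrix, sim[i, j] = score(image_i, text_j).
    Diagonal entries are the matched pairs; off-diagonals are negatives."""
    pos = sim.diag().view(-1, 1)
    # image -> text: every non-matching text in the batch is a negative
    cost_i2t = (margin + sim - pos).clamp(min=0)
    # text -> image: every non-matching image in the batch is a negative
    cost_t2i = (margin + sim - pos.t()).clamp(min=0)
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    cost_i2t = cost_i2t.masked_fill(mask, 0)
    cost_t2i = cost_t2i.masked_fill(mask, 0)
    # Hardest negative per row (image->text) and per column (text->image)
    return cost_i2t.max(dim=1)[0].mean() + cost_t2i.max(dim=0)[0].mean()
```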
Abstract: Pre-training visual and textual representations from large-scale image-text pairs is becoming a standard approach for many downstream vision-language tasks. Transformer-based models learn inter- and intra-modal attention through a list of self-supervised learning tasks. This paper proposes LAViTeR, a novel architecture for visual and textual representation learning. The main module, Visual-Textual Alignment (VTA), is assisted by two auxiliary tasks: GAN-based image synthesis and image captioning. We also propose a new evaluation metric that measures the similarity between the learned visual and textual embeddings. Experimental results on two public datasets, CUB and MS-COCO, demonstrate superior visual and textual representation alignment in the joint feature embedding space.
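The abstract does not spell out the proposed metric, so the sketch below shows one plausible alignment score under our own assumptions: mean cosine similarity of matched image-text pairs minus that of mismatched pairs. It is an illustration, not the paper's metric.

```python
import torch
import torch.nn.functional as F

def mean_pairwise_alignment(img_emb, txt_emb):
    """img_emb, txt_emb: (B, D) embeddings where row i of each is a matched pair.
    Returns matched-pair similarity minus mismatched-pair similarity."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    sim = img @ txt.t()                                   # (B, B) cosine similarities
    matched = sim.diag().mean()
    mask = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    mismatched = sim[mask].mean()
    return matched - mismatched                           # higher = better alignment
```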
Abstract: In clinical applications, neural networks must focus on and highlight the most important parts of an input image. The Soft-Attention mechanism enables a neural network to achieve this goal. This paper investigates the effectiveness of Soft-Attention in deep neural architectures. The central aim of Soft-Attention is to boost the value of important features and suppress noise-inducing features. We compare the performance of the VGG, ResNet, InceptionResNetv2, and DenseNet architectures with and without the Soft-Attention mechanism on skin lesion classification. When coupled with Soft-Attention, the original network outperforms the baseline [14] by 4.7% while achieving a precision of 93.7% on the HAM10000 dataset. Additionally, Soft-Attention coupling improves the sensitivity score by 3.8% compared to the baseline [28], reaching 91.6% on the ISIC-2017 dataset. The code is publicly available on GitHub.
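A minimal soft-attention layer in PyTorch might look like the following. The head count, kernel size, and learnable residual scaling are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttention(nn.Module):
    """Learns K spatial attention maps over a CNN feature grid and rescales
    salient locations: a sketch of the general mechanism described above."""
    def __init__(self, channels, heads=16):
        super().__init__()
        self.attn = nn.Conv2d(channels, heads, kernel_size=3, padding=1)
        self.gamma = nn.Parameter(torch.tensor(0.0))      # learnable residual scale

    def forward(self, x):                                 # x: (B, C, H, W)
        b, c, h, w = x.shape
        maps = self.attn(x)                               # (B, K, H, W)
        maps = F.softmax(maps.flatten(2), dim=-1).view(b, -1, h, w)
        alpha = maps.sum(dim=1, keepdim=True)             # aggregate heads: (B, 1, H, W)
        return x + self.gamma * (x * alpha)               # boost attended features
```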
Abstract: We propose a novel, attention-based self-supervised approach to identify "claim-worthy" sentences in a fake news article, an important first step in automated fact-checking. We leverage the "aboutness" of the headline with respect to the content, using an attention mechanism for this task. The identified claims can be used for the downstream task of claim verification, for which we release a benchmark dataset of manually selected compelling articles with veracity labels and associated evidence. This work goes beyond stylistic analysis to identify content that influences reader belief. Experiments on three datasets show the strength of our model. Data and code are available at https://github.com/architapathak/Self-Supervised-ClaimIdentification
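One way to realize the headline-content "aboutness" scoring is sketched below; the linear projection and scaled dot-product form are our assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeadlineContentAttention(nn.Module):
    """Scores each content sentence by its attention weight against the
    headline encoding: an illustrative 'aboutness' scorer."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, headline, sentences):
        # headline: (B, D) encoding; sentences: (B, S, D) sentence encodings
        q = self.proj(headline).unsqueeze(1)                        # (B, 1, D)
        scores = (q * sentences).sum(-1) / sentences.size(-1) ** 0.5  # (B, S)
        return F.softmax(scores, dim=-1)   # per-sentence claim-worthiness weights
```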
Abstract: The task of writer verification is to provide a likelihood score for whether queried and known handwritten image samples belong to the same writer. Such a task calls for the neural network to make its outcome interpretable, i.e., to provide a view into the network's decision-making process. We implement and integrate cross-attention and soft-attention mechanisms to capture the highly correlated and salient points in the feature space of 2D inputs. The attention maps serve as an explanatory premise for the network's output likelihood score. The attention mechanism also allows the network to focus on the most relevant areas of the input, improving classification performance. Our proposed approach achieves a precision of 86% for detecting intra-writer cases on the CEDAR cursive "AND" dataset. Furthermore, we generate meaningful explanations for the provided decision by extracting attention maps from multiple levels of the network.
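A cross-attention block over two 2D feature maps, whose attention matrix can be visualized as an explanation overlay, might be sketched as follows; the projection dimensions are illustrative assumptions, not the exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossAttention2D(nn.Module):
    """Cross-attention between the feature maps of a queried and a known
    handwriting sample; the returned attention map doubles as an explanation."""
    def __init__(self, channels, dim=64):
        super().__init__()
        self.q = nn.Conv2d(channels, dim, 1)
        self.k = nn.Conv2d(channels, dim, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, xa, xb):                            # both (B, C, H, W)
        b, c, h, w = xa.shape
        q = self.q(xa).flatten(2).transpose(1, 2)         # (B, HW, D)
        k = self.k(xb).flatten(2)                         # (B, D, HW)
        v = self.v(xb).flatten(2).transpose(1, 2)         # (B, HW, C)
        attn = F.softmax(q @ k / q.size(-1) ** 0.5, dim=-1)  # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).view(b, c, h, w)
        return out, attn               # attended features + explanation map
```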
Abstract: The electroencephalogram (EEG) signal is widely used in brain-computer interfaces (BCI), but its pattern differs significantly across subjects, posing a major challenge for real-world deployment of EEG classifiers. We present an efficient transfer learning method, named Meta UPdate Strategy (MUPS), that boosts cross-subject classification performance of EEG signals while needing only a small amount of data from the target subject. The model tackles the problem in two steps: (1) extract versatile features that are effective across all source subjects, and (2) adapt the model to the target subject. The proposed model, which originates from meta-learning, aims to find a feature representation that is broadly suitable across subjects and maximizes the sensitivity of the loss function on a new subject, such that one or a few gradient steps suffice for effective adaptation. The method can be applied to any deep-learning-based model. In extensive experiments on two public datasets, the proposed MUPS model outperforms the current state of the art by a large margin in accuracy and AUC-ROC when only a small amount of target data is used.
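Step (2), adapting to the target subject, can be sketched as a few gradient steps on a small support set. This generic MAML-style fine-tuning loop stands in for the exact MUPS update rule, which the abstract does not specify.

```python
import copy
import torch

def adapt_to_subject(model, loss_fn, support_x, support_y, steps=5, lr=1e-2):
    """Fine-tune a copy of the meta-learned model on the target subject's
    small support set, leaving the meta-learned weights intact."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(adapted(support_x), support_y)
        loss.backward()
        opt.step()
    return adapted   # evaluate on the target subject's held-out trials
```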
Abstract: Deep learning systems have the drawback that their output is not accompanied by an explanation. In a domain such as forensic handwriting verification, it is essential to provide explanations to jurors. The goal of handwriting verification is to find a measure of confidence for whether the given handwritten samples are written by the same or different writers. We propose a method to generate explanations for the confidence provided by a convolutional neural network (CNN) that maps the input image to 15 annotations (features) provided by experts. Our system comprises: (1) a feature learning network (FLN), a differentiable system, and (2) an inference module for providing explanations. The inference module provides two types of explanations: (a) based on the cosine similarity between the categorical probabilities of each feature, and (b) based on the log-likelihood ratio (LLR) using a directed probabilistic graphical model. We perform experiments using combinations of the feature learning network (FLN) with each inference module. We evaluate our system on the XAI-AND dataset, which contains 13,700 handwritten samples and 15 corresponding expert-examined features for each sample. The dataset is released for public use, and the methods can be extended to provide explanations for other verification tasks such as face verification and biomedical comparison. This dataset can serve as a basis and benchmark for future research on explanation-based handwriting verification. The code is available on GitHub.
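Explanation type (a) can be sketched directly: a per-feature cosine similarity between the two samples' categorical probability vectors. This is an illustrative sketch of the described inference module, with hypothetical input shapes.

```python
import torch
import torch.nn.functional as F

def feature_agreement(probs_q, probs_k):
    """probs_q, probs_k: lists of 15 tensors, one softmax vector per expert
    feature, for the questioned and known samples respectively.
    Returns one similarity per feature; high similarity on a feature
    supports a 'same writer' conclusion for that feature."""
    return [F.cosine_similarity(q.unsqueeze(0), k.unsqueeze(0)).item()
            for q, k in zip(probs_q, probs_k)]
```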
Abstract: We propose an effective Hybrid Deep Learning (HDL) architecture for the task of determining the probability that a questioned handwritten word has been written by a known writer. HDL is an amalgamation of Auto-Learned Features (ALF) and Human-Engineered Features (HEF). To extract auto-learned features we use two methods: first, a Two-Channel Convolutional Neural Network (TC-CNN); second, a Two-Channel Autoencoder (TC-AE). Human-engineered features are likewise extracted by two methods: first, Gradient Structural Concavity (GSC); second, the Scale-Invariant Feature Transform (SIFT). Experiments are performed by complementing one of the HEF methods with one ALF method on 150,000 pairs of samples of the word "AND" cropped from handwritten notes written by 1,500 writers. Our results indicate that the HDL architecture with AE-GSC achieves 99.7% accuracy on the seen-writer dataset and 92.16% accuracy on the shuffled-writer dataset, outperforming CEDAR-FOX; on the unseen-writer dataset, AE-SIFT performs comparably to this sophisticated handwriting comparison tool.
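The ALF + HEF pairing can be sketched as feature concatenation ahead of a verifier; the per-view L2 normalization below is our assumption, not the paper's stated preprocessing.

```python
import numpy as np

def hybrid_features(alf_embed, hef_vector, eps=1e-8):
    """Concatenate an auto-learned embedding (e.g. from the two-channel
    autoencoder) with a human-engineered descriptor (e.g. GSC), normalizing
    each view so neither dominates the joint representation."""
    alf = alf_embed / (np.linalg.norm(alf_embed) + eps)
    hef = hef_vector / (np.linalg.norm(hef_vector) + eps)
    return np.concatenate([alf, hef])   # feed to the downstream verifier
```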