Abstract: In this paper, we propose a novel approach to word-level Indic script identification that uses only character-level data in the training stage. The advantages of training on character-level data are outlined in Section I. Our method uses a multimodal deep network that takes both the offline and online modalities of the data as input, so that information from the two modalities is exploited jointly for the script identification task. Handwritten data is taken as input in either modality, and the opposite modality is generated through inter-modality conversion. The resulting offline-online pair is then fed to our network. Hence, in addition to utilizing information from both modalities, the method works as a single framework for both offline and online script identification, which removes the need to design a separate script identification module for each modality. As a further major contribution, we propose a novel conditional multimodal fusion scheme that combines the information from the offline and online modalities adaptively by taking into account the real origin of the data being fed to the network. Exhaustive experiments were conducted on a data set consisting of English and six Indic scripts. Our proposed framework outperforms, by a clear margin, both frameworks based on traditional classifiers with handcrafted features and deep learning based methods. The experiments also show that, in our framework, training on character-level data alone achieves state-of-the-art performance comparable to that obtained with conventional training on word-level data.