Abstract:Image-to-image translation is the recent trend to transform images from one domain to another domain using generative adversarial network (GAN). The existing GAN models perform the training by only utilizing the input and output modalities of transformation. In this paper, we perform the semantic injected training of GAN models. Specifically, we train with original input and output modalities and inject a few epochs of training for translation from input to semantic map. Lets refer the original training as the training for the translation of input image into target domain. The injection of semantic training in the original training improves the generalization capability of the trained GAN model. Moreover, it also preserves the categorical information in a better way in the generated image. The semantic map is only utilized at the training time and is not required at the test time. The experiments are performed using state-of-the-art GAN models over CityScapes and RGB-NIR stereo datasets. We observe the improved performance in terms of the SSIM, FID and KID scores after injecting semantic training as compared to original training.
Abstract:A balanced speech corpus is the basic need for any speech processing task. In this report we describe our effort on development of Assamese speech corpus. We mainly focused on some issues and challenges faced during development of the corpus. Being a less computationally aware language, this is the first effort to develop speech corpus for Assamese. As corpus development is an ongoing process, in this paper we report only the initial task.