Abstract:Medical image classification is one of the most critical problems in the image recognition area. One of the major challenges in this field is the scarcity of labelled training data. Additionally, there is often class imbalance in datasets as some cases are very rare to happen. As a result, accuracy in classification task is normally low. Deep Learning models, in particular, show promising results on image segmentation and classification problems, but they require very large datasets for training. Therefore, there is a need to generate more of synthetic samples from the same distribution. Previous work has shown that feature generation is more efficient and leads to better performance than corresponding image generation. We apply this idea in the Medical Imaging domain. We use transfer learning to train a segmentation model for the small dataset for which gold-standard class annotations are available. We extracted the learnt features and use them to generate synthetic features conditioned on class labels, using Auxiliary Classifier GAN (ACGAN). We test the quality of the generated features in a downstream classification task for brain tumors according to their severity level. Experimental results show a promising result regarding the validity of these generated features and their overall contribution to balancing the data and improving the classification class-wise accuracy.
Abstract:Detection of offensive language in social media is one of the key challenges for social media. Researchers have proposed many advanced methods to accomplish this task. In this report, we try to use the learnings from their approach and incorporate our ideas to improve upon them. We have successfully achieved an accuracy of 74% in classifying offensive tweets. We also list upcoming challenges in the abusive content detection in the social media world.
Abstract:One of the vital breakthroughs in the history of machine translation is the development of the Transformer model. Not only it is revolutionary for various translation tasks, but also for a majority of other NLP tasks. In this paper, we aim at a Transformer-based system that is able to translate a source sentence in German to its counterpart target sentence in English. We perform the experiments on the news commentary German-English parallel sentences from the WMT'13 dataset. In addition, we investigate the effect of the inclusion of additional general-domain data in training from the IWSLT'16 dataset to improve the Transformer model performance. We find that including the IWSLT'16 dataset in training helps achieve a gain of 2 BLEU score points on the test set of the WMT'13 dataset. Qualitative analysis is introduced to analyze how the usage of general-domain data helps improve the quality of the produced translation sentences.
Abstract:Histopathology refers to the examination by a pathologist of biopsy samples. Histopathology images are captured by a microscope to locate, examine, and classify many diseases, such as different cancer types. They provide a detailed view of different types of diseases and their tissue status. These images are an essential resource with which to define biological compositions or analyze cell and tissue structures. This imaging modality is very important for diagnostic applications. The analysis of histopathology images is a prolific and relevant research area supporting disease diagnosis. In this paper, the challenges of histopathology image analysis are evaluated. An extensive review of conventional and deep learning techniques which have been applied in histological image analyses is presented. This review summarizes many current datasets and highlights important challenges and constraints with recent deep learning techniques, alongside possible future research avenues. Despite the progress made in this research area so far, it is still a significant area of open research because of the variety of imaging techniques and disease-specific characteristics.