Abstract:Deep learning and computer vision methods are nowadays predominantly used in the field of ophthalmology. In this paper, we present an attention-aided DenseNet-121 for classifying normal and glaucomatous eyes from fundus images. It involves the convolutional block attention module to highlight relevant spatial and channel features extracted by DenseNet-121. The channel recalibration module further enriches the features by utilizing edge information along with the statistical features of the spatial dimension. For the experiments, two standard datasets, namely RIM-ONE and ACRIMA, have been used. Our method has shown superior results than state-of-the-art models. An ablation study has also been conducted to show the effectiveness of each of the components. The code of the proposed work is available at: https://github.com/Soham2004GitHub/DADGC.
Abstract:Skin cancer is a highly dangerous type of cancer that requires an accurate diagnosis from experienced physicians. To help physicians diagnose skin cancer more efficiently, a computer-aided diagnosis (CAD) system can be very helpful. In this paper, we propose a novel model, which uses a novel attention mechanism to pinpoint the differences in features across the spatial dimensions and symmetry of the lesion, thereby focusing on the dissimilarities of various classes based on symmetry, uniformity in texture and color, etc. Additionally, to take into account the variations in the boundaries of the lesions for different classes, we employ a gradient-based fusion of wavelet and soft attention-aided features to extract boundary information of skin lesions. We have tested our model on the multi-class and highly class-imbalanced dataset, called HAM10000, and achieved promising results, with a 91.17\% F1-score and 90.75\% accuracy. The code is made available at: https://github.com/AyushRoy2001/WAGF-Fusion.
Abstract:Pneumonia is a respiratory infection caused by bacteria, fungi, or viruses. It affects many people, particularly those in developing or underdeveloped nations with high pollution levels, unhygienic living conditions, overcrowding, and insufficient medical infrastructure. Pneumonia can cause pleural effusion, where fluids fill the lungs, leading to respiratory difficulty. Early diagnosis is crucial to ensure effective treatment and increase survival rates. Chest X-ray imaging is the most commonly used method for diagnosing pneumonia. However, visual examination of chest X-rays can be difficult and subjective. In this study, we have developed a computer-aided diagnosis system for automatic pneumonia detection using chest X-ray images. We have used DenseNet-121 and ResNet50 as the backbone for the binary class (pneumonia and normal) and multi-class (bacterial pneumonia, viral pneumonia, and normal) classification tasks, respectively. We have also implemented a channel-specific spatial attention mechanism, called Fuzzy Channel Selective Spatial Attention Module (FCSSAM), to highlight the specific spatial regions of relevant channels while removing the irrelevant channels of the extracted features by the backbone. We evaluated the proposed approach on a publicly available chest X-ray dataset, using binary and multi-class classification setups. Our proposed method achieves accuracy rates of 97.15\% and 79.79\% for the binary and multi-class classification setups, respectively. The results of our proposed method are superior to state-of-the-art (SOTA) methods. The code of the proposed model will be available at: https://github.com/AyushRoy2001/FA-Net.
Abstract:Accurate nuclei segmentation in histopathological images is crucial for cancer diagnosis. Automating this process offers valuable support to clinical experts, as manual annotation is time-consuming and prone to human errors. However, automating nuclei segmentation presents challenges due to uncertain cell boundaries, intricate staining, and diverse structures. In this paper, we present a segmentation approach that combines the U-Net architecture with a DenseNet-121 backbone, harnessing the strengths of both to capture comprehensive contextual and spatial information. Our model introduces the Wavelet-guided channel attention module to enhance cell boundary delineation, along with a learnable weighted global attention module for channel-specific attention. The decoder module, composed of an upsample block and convolution block, further refines segmentation in handling staining patterns. The experimental results conducted on two publicly accessible histopathology datasets, namely Monuseg and TNBC, underscore the superiority of our proposed model, demonstrating its potential to advance histopathological image analysis and cancer diagnosis. The code is made available at: https://github.com/AyushRoy2001/AWGUNET.
Abstract:Breast cancer is a major global health concern. Pathologists face challenges in analyzing complex features from pathological images, which is a time-consuming and labor-intensive task. Therefore, efficient computer-based diagnostic tools are needed for early detection and treatment planning. This paper presents a modified version of MultiResU-Net for histopathology image segmentation, which is selected as the backbone for its ability to analyze and segment complex features at multiple scales and ensure effective feature flow via skip connections. The modified version also utilizes the Gaussian distribution-based Attention Module (GdAM) to incorporate histopathology-relevant text information in a Gaussian distribution. The sampled features from the Gaussian text feature-guided distribution highlight specific spatial regions based on prior knowledge. Finally, using the Controlled Dense Residual Block (CDRB) on skip connections of MultiResU-Net, the information is transferred from the encoder layers to the decoder layers in a controlled manner using a scaling parameter derived from the extracted spatial features. We validate our approach on two diverse breast cancer histopathology image datasets: TNBC and MonuSeg, demonstrating superior segmentation performance compared to state-of-the-art methods. The code for our proposed model is available on https://github.com/AyushRoy2001/GRU-Net.
Abstract:Pneumonia is one of the major reasons for child mortality especially in income-deprived regions of the world. Although it can be detected and treated with very less sophisticated instruments and medication, Pneumonia detection still remains a major concern in developing countries. Computer-aided based diagnosis (CAD) systems can be used in such countries due to their lower operating costs than professional medical experts. In this paper, we propose a CAD system for Pneumonia detection from Chest X-rays, using the concepts of deep learning and a meta-heuristic algorithm. We first extract deep features from the pre-trained ResNet50, fine-tuned on a target Pneumonia dataset. Then, we propose a feature selection technique based on particle swarm optimization (PSO), which is modified using a memory-based adaptation parameter, and enriched by incorporating an altruistic behavior into the agents. We name our feature selection method as adaptive and altruistic PSO (AAPSO). The proposed method successfully eliminates non-informative features obtained from the ResNet50 model, thereby improving the Pneumonia detection ability of the overall framework. Extensive experimentation and thorough analysis on a publicly available Pneumonia dataset establish the superiority of the proposed method over several other frameworks used for Pneumonia detection. Apart from Pneumonia detection, AAPSO is further evaluated on some standard UCI datasets, gene expression datasets for cancer prediction and a COVID-19 prediction dataset. The overall results are satisfactory, thereby confirming the usefulness of AAPSO in dealing with varied real-life problems. The supporting source codes of this work can be found at https://github.com/rishavpramanik/AAPSO
Abstract:Segmentation is essential for medical image analysis to identify and localize diseases, monitor morphological changes, and extract discriminative features for further diagnosis. Skin cancer is one of the most common types of cancer globally, and its early diagnosis is pivotal for the complete elimination of malignant tumors from the body. This research develops an Artificial Intelligence (AI) framework for supervised skin lesion segmentation employing the deep learning approach. The proposed framework, called MFSNet (Multi-Focus Segmentation Network), uses differently scaled feature maps for computing the final segmentation mask using raw input RGB images of skin lesions. In doing so, initially, the images are preprocessed to remove unwanted artifacts and noises. The MFSNet employs the Res2Net backbone, a recently proposed convolutional neural network (CNN), for obtaining deep features used in a Parallel Partial Decoder (PPD) module to get a global map of the segmentation mask. In different stages of the network, convolution features and multi-scale maps are used in two boundary attention (BA) modules and two reverse attention (RA) modules to generate the final segmentation output. MFSNet, when evaluated on three publicly available datasets: $PH^2$, ISIC 2017, and HAM10000, outperforms state-of-the-art methods, justifying the reliability of the framework. The relevant codes for the proposed approach are accessible at https://github.com/Rohit-Kundu/MFSNet
Abstract:Segmentation is an essential requirement in medicine when digital images are used in illness diagnosis, especially, in posterior tasks as analysis and disease identification. An efficient segmentation of brain Magnetic Resonance Images (MRIs) is of prime concern to radiologists due to their poor illumination and other conditions related to de acquisition of the images. Thresholding is a popular method for segmentation that uses the histogram of an image to label different homogeneous groups of pixels into different classes. However, the computational cost increases exponentially according to the number of thresholds. In this paper, we perform the multi-level thresholding using an evolutionary metaheuristic. It is an improved version of the Harris Hawks Optimization (HHO) algorithm that combines the chaotic initialization and the concept of altruism. Further, for fitness assignment, we use a hybrid objective function where along with the cross-entropy minimization, we apply a new entropy function, and leverage weights to the two objective functions to form a new hybrid approach. The HHO was originally designed to solve numerical optimization problems. Earlier, the statistical results and comparisons have demonstrated that the HHO provides very promising results compared with well-established metaheuristic techniques. In this article, the altruism has been incorporated into the HHO algorithm to enhance its exploitation capabilities. We evaluate the proposed method over 10 benchmark images from the WBA database of the Harvard Medical School and 8 benchmark images from the Brainweb dataset using some standard evaluation metrics.
Abstract:Segmentation of handwritten document images into text lines and words is one of the most significant and challenging tasks in the development of a complete Optical Character Recognition (OCR) system. This paper addresses the automatic segmentation of text words directly from unconstrained Bangla handwritten document images. The popular Distance transform (DT) algorithm is applied for locating the outer boundary of the word images. This technique is free from generating the over-segmented words. A simple post-processing procedure is applied to isolate the under-segmented word images, if any. The proposed technique is tested on 50 random images taken from CMATERdb1.1.1 database. Satisfactory result is achieved with a segmentation accuracy of 91.88% which confirms the robustness of the proposed methodology.
Abstract:A considerable amount of success has been achieved in developing monolingual OCR systems for Indic scripts. But in a country like India, where multi-script scenario is prevalent, identifying scripts beforehand becomes obligatory. In this paper, we present the significance of Gabor wavelets filters in extracting directional energy and entropy distributions for 11 official handwritten scripts namely, Bangla, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Urdu and Roman. The experimentation is conducted at block level based on a quad-tree decomposition approach and evaluated using six different well-known classifiers. Finally, the best identification accuracy of 96.86% has been achieved by Multi Layer Perceptron (MLP) classifier for 3-fold cross validation at level-2 decomposition. The results serve to establish the efficacy of the present approach to the classification of handwritten Indic scripts