Abstract:Prostate cancer is one of the most frequently diagnosed malignancies in men, and the rising demand for biopsies places a severe workload on pathologists. The grading procedure is tedious and subjective, motivating the development of automated systems. Although deep learning has made inroads in terms of performance, its limited interpretability poses challenges for widespread adoption in high-stakes applications like medicine. Existing interpretability techniques for prostate cancer classifiers provide a coarse explanation but do not reveal why the highlighted regions matter. In this scenario, we propose a novel prototype-based weakly-supervised framework for interpretable grading of prostate cancer from histopathology images. Such networks can prove to be more trustworthy, since their explicit reasoning procedure mirrors the workflow of a pathologist in comparing suspicious regions with clinically validated examples. The network is initially pre-trained at patch-level to learn robust prototypical features associated with each grade. To adapt it to a weakly-supervised setup for prostate cancer grading, the network is then fine-tuned with a new prototype-aware loss function. Finally, a new attention-based dynamic pruning mechanism is introduced to handle inter-sample heterogeneity, while selectively emphasizing relevant prototypes for optimal performance. Extensive validation on the benchmark PANDA and SICAP datasets confirms that the framework can serve as a reliable assistive tool for pathologists in their routine diagnostic workflows.
Abstract:Diabetic Retinopathy (DR) requires timely screening to prevent irreversible vision loss. However, its early detection remains a significant challenge, since subtle pathological manifestations (lesions) are often overlooked due to insufficient annotation. Existing literature primarily focuses on image-level supervision, weakly-supervised localization, and clustering-based representation learning, which fail to systematically annotate unlabeled lesion region(s) for refining the dataset. Expert-driven lesion annotation is labor-intensive and often incomplete, limiting the performance of deep learning models. We introduce Similarity-based Annotation via Feature-space Ensemble (SAFE), a two-stage framework that unifies weak supervision, contrastive learning, and patch-wise embedding inference to systematically expand sparse annotations of the pathology. SAFE preserves fine-grained details of the lesion(s) under partial clinical supervision. In the first stage, a dual-arm Patch Embedding Network learns semantically structured, class-discriminative embeddings from expert-annotated patches. Next, an ensemble of independent embedding spaces extrapolates labels to the unannotated regions based on spatial and semantic proximity. An abstention mechanism controls the trade-off between highly reliable annotation and noisy coverage. Experimental results demonstrate reliable separation of healthy and diseased patches, achieving up to 0.9886 accuracy. The annotation generated by SAFE improves downstream tasks such as DR classification, with a substantial increase in the F1-score of the diseased class and a performance gain as high as 0.545 in Area Under the Precision-Recall Curve (AUPRC). Qualitative analysis, with explainability, confirms that SAFE focuses on clinically relevant lesion patterns; this is further validated by ophthalmologists.
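The abstention idea in the SAFE abstract can be illustrated with a minimal sketch. The abstract does not specify how the ensemble decides when to abstain, so the majority-agreement rule below, the function name `ensemble_label`, the `ABSTAIN` sentinel, and the `min_agreement` threshold are all assumptions made for illustration, not the authors' method. Each entry of `votes` stands for the label proposed for one patch by one independent embedding space.

```python
import numpy as np

ABSTAIN = -1  # hypothetical sentinel meaning "leave this patch unlabeled"


def ensemble_label(votes, min_agreement=0.8):
    """Return the majority label if enough ensemble members agree, else abstain.

    votes: one proposed label per embedding space (assumed integer labels >= 0).
    min_agreement: assumed fraction of members that must share the top label.
    """
    votes = np.asarray(votes)
    labels, counts = np.unique(votes, return_counts=True)
    best = counts.argmax()
    if counts[best] / votes.size >= min_agreement:
        return int(labels[best])
    return ABSTAIN
```

Raising `min_agreement` yields fewer but more reliable labels, while lowering it labels more patches at the cost of noisier annotations, which is the trade-off the abstract refers to.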
Abstract:Prostate cancer grading using the International Society of Urological Pathology (ISUP) system for treatment decisions is highly subjective and requires considerable expertise. Despite advances in computer-aided diagnosis systems, few have handled efficient ISUP grading on Whole Slide Images (WSIs) of prostate biopsies based only on slide-level labels. General challenges include managing gigapixel WSIs, obtaining patch-level annotations, and dealing with stain variability across centers. One of the main task-specific challenges faced by deep learning in ISUP grading is the learning of patch-level features of Gleason patterns (GPs) based only on their slide labels. In this scenario, an efficient framework for ISUP grading is developed. The proposed TSOR is based on a novel Task-specific Self-supervised learning (SSL) model, which is fine-tuned using Ordinal Regression. Since the diversity of training samples plays a crucial role in SSL, a patch-level dataset is created to be relatively balanced w.r.t. the Gleason grades (GGs). This balanced dataset is used for pre-training, so that the model can effectively learn stain-agnostic features of the GPs for better generalization. In medical image grading, it is desirable that misclassifications be as close as possible to the actual grade. From this perspective, the model is then fine-tuned for the task of ISUP grading using an ordinal regression-based approach. Experimental results on the most extensive multicenter prostate biopsy dataset (PANDA challenge), as well as the SICAP dataset, demonstrate the effectiveness of this novel framework compared to state-of-the-art methods.
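To illustrate why ordinal regression keeps misclassifications close to the true grade, here is a minimal sketch of one common ordinal encoding, in which grade k becomes a vector of cumulative binary targets. The abstract does not state which ordinal formulation TSOR uses, so this encoding, the function names, and the assumption of six grades (0 to 5, with 0 taken to mean benign) are illustrative only.

```python
import numpy as np

NUM_GRADES = 6  # assumption: ISUP grades 0..5, with 0 denoting benign


def encode_ordinal(grade, num_grades=NUM_GRADES):
    """Encode grade k as K-1 binary targets whose first k entries are 1."""
    return (np.arange(num_grades - 1) < grade).astype(np.float32)


def decode_ordinal(probs, threshold=0.5):
    """Predict a grade by counting the leading thresholds that are exceeded."""
    passed = (np.asarray(probs) > threshold).astype(int)
    # cumprod zeroes everything after the first failed threshold, so the
    # decoded grade stays consistent with the ordering of the grades
    return int(np.cumprod(passed).sum())
```

With this encoding, the targets for neighbouring grades differ in a single position, so a binary cross-entropy loss penalises a prediction two grades away more than one that is one grade away.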
Abstract:The prevalence of ocular illnesses is growing globally, presenting a substantial public health challenge. Early detection and timely intervention are crucial for averting visual impairment and enhancing patient prognosis. This research introduces a new framework called Class Extension with Limited Data (CELD) to train a classifier to categorize retinal fundus images. The classifier is initially trained to identify relevant features concerning Healthy and Diabetic Retinopathy (DR) classes and later fine-tuned to adapt to the task of classifying the input images into three classes: Healthy, DR, and Glaucoma. This strategy allows the model to gradually enhance its classification capabilities, which is beneficial in situations where there are only a limited number of labeled datasets available. Perturbation methods are also used to identify the input image characteristics responsible for influencing the model's decision-making process. We achieve an overall accuracy of 91% on publicly available datasets.
Abstract:The development of efficient medical image segmentation has evolved from an initial dependence on Convolutional Neural Networks (CNNs) to the present investigation of hybrid models that combine CNNs with Vision Transformers. Furthermore, there is an increasing focus on creating architectures that are both high-performing in medical image segmentation tasks and computationally efficient enough to be deployed on systems with limited resources. Although transformers have several advantages, such as capturing global dependencies in the input data, they face challenges such as high computational and memory complexity. This paper investigates the integration of CNNs and Vision Extended Long Short-Term Memory (Vision-xLSTM) models by introducing a novel approach called UVixLSTM. The Vision-xLSTM blocks capture temporal and global relationships within the patches extracted from the CNN feature maps. The convolutional feature reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. Our primary objective is to show that Vision-xLSTM forms a reliable backbone for medical image segmentation tasks, offering excellent segmentation performance and reduced computational complexity. UVixLSTM exhibits superior performance compared to state-of-the-art networks on the publicly available Synapse dataset. Code is available at: https://github.com/duttapallabi2907/UVixLSTM




Abstract:The COVID-19 pandemic, with its multiple variants, has placed immense pressure on the global healthcare system. Early and effective screening and grading become imperative for optimizing the limited resources available at medical facilities. Computed tomography (CT) provides a significant non-invasive screening mechanism for COVID-19 infection. An automated segmentation of the infected volumes in lung CT is expected to significantly aid in the diagnosis and care of patients. However, an accurate demarcation of lesions remains problematic due to their irregular structure and location(s) within the lung. A novel deep learning architecture, Mixed Attention Deeply Supervised Network (MiADS-Net), is proposed for delineating the infected regions of the lung from CT images. Incorporating dilated convolutions with varying dilation rates into a mixed attention framework allows the capture of multi-scale features, towards improved segmentation of lesions having different sizes and textures. Mixed attention helps prioritise relevant feature maps to be probed, along with those regions containing crucial information within these maps. Deep supervision facilitates discovery of robust and discriminatory characteristics in the hidden layers at shallower levels, while overcoming the vanishing gradient problem. This is followed by estimating the severity of the disease, based on the ratio of the infected region in each lung to the entire volume of that lung. Experimental results on three publicly available datasets indicate that MiADS-Net outperforms several state-of-the-art architectures in the COVID-19 lesion segmentation task, particularly in defining structures involving complex geometries.
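The severity estimate mentioned in the MiADS-Net abstract, a ratio of infected region to lung volume, can be sketched as a voxel count over binary masks. This is only an illustration under stated assumptions: the function name `infection_ratio`, the use of one binary lesion mask and one binary mask for a single lung, and the choice to return 0.0 for an empty lung mask are not taken from the paper.

```python
import numpy as np


def infection_ratio(lesion_mask, lung_mask):
    """Fraction of lung voxels that are marked as infected.

    Both inputs are assumed to be arrays of the same shape in which nonzero
    entries mark the lesion and a single lung respectively.
    """
    lung = np.asarray(lung_mask).astype(bool)
    # count only lesion voxels that fall inside the lung mask
    lesion = np.asarray(lesion_mask).astype(bool) & lung
    lung_voxels = lung.sum()
    if lung_voxels == 0:
        return 0.0  # assumed convention when no lung is present in the mask
    return float(lesion.sum() / lung_voxels)
```

Calling the function once per lung (left and right masks) gives the per-lung ratios that a severity grade could then be derived from; any mapping from ratio to grade would need thresholds that the abstract does not provide.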
Abstract:The devastation caused by the coronavirus pandemic makes it imperative to design automated techniques for fast and accurate detection. We propose a novel non-invasive tool, using deep learning and imaging, for delineating COVID-19 infection in lungs. The Ensembling Attention-based Multi-scaled Convolution network (EAMC), employing Leave-One-Patient-Out (LOPO) training, exhibits high sensitivity and precision in outlining infected regions, along with assessment of severity. The Attention module combines contextual with local information, at multiple scales, for accurate segmentation. Ensemble learning integrates the heterogeneous decisions of different base classifiers. The superiority of EAMC, even with severe class imbalance, is established through comparison with existing state-of-the-art learning models over four publicly available COVID-19 datasets. The results are suggestive of the relevance of deep learning in providing assistive intelligence to medical practitioners when they are overburdened with patients, as in pandemics. Its clinical significance lies in its unprecedented scope in providing low-cost decision-making for patients lacking specialized healthcare at remote locations.
Abstract:Automated delineation of COVID-19 lesions from lung CT scans aids the diagnosis and prognosis for patients. The asymmetric shapes and positioning of the infected regions make the task extremely difficult. Capturing information at multiple scales assists in deciphering features, at global and local levels, to encompass lesions of variable size and texture. We introduce the Full-scale Deeply Supervised Attention Network (FuDSA-Net) for efficient segmentation of corona-infected lung areas in CT images. The model considers activation responses from all levels of the encoding path, encompassing multi-scalar features acquired at different levels of the network. This helps segment target regions (lesions) of varying shape, size and contrast. Incorporation of the entire gamut of multi-scalar characteristics into the novel attention mechanism helps prioritize the selection of activation responses and locations containing useful information. Determining robust and discriminatory features along the decoder path is facilitated with deep supervision. Connections in the decoder arm are remodeled to handle the issue of vanishing gradient. As observed from the experimental results, FuDSA-Net surpasses other state-of-the-art architectures, especially when it comes to characterizing complicated geometries of the lesions.




Abstract:Rational approximations are introduced and studied in granular graded sets and generalizations thereof by the first author in recent research papers. The concept of rationality is determined by related ontologies and coherence between granularity, parthood perspective and approximations used in the context. In addition, a framework is introduced by her in the mentioned paper(s). Granular approximations constructed as per the procedures of VPRS are likely to be more rational than those constructed from a classical perspective under certain conditions. This may continue to hold for some generalizations of the former; however, a formal characterization of such conditions is not available in the previously published literature. In this research, theoretical aspects of the problem are critically examined, uniform generalizations of granular VPRS are introduced, new connections with granular graded rough sets are proved, appropriate concepts of substantial parthood are introduced, and their extent of compatibility with the framework is assessed. Furthermore, meta applications to cluster validation, image segmentation and dynamic sorting are proposed. Basic assumptions made are explained, and additional examples are constructed for readability.




Abstract:Prediction of Overall Survival (OS) of brain cancer patients from multi-modal MRI is a challenging field of research. Most of the existing literature on survival prediction is based on radiomic features, which do not consider either non-biological factors or the functional neurological status of the patient(s). Besides, the selection of an appropriate cut-off for survival and the presence of censored data create further problems. Application of deep learning models for OS prediction is also limited due to the lack of large annotated publicly available datasets. In this scenario, we analyse the potential of two novel neuroimaging feature families, extracted from brain parcellation atlases and spatial habitats, along with classical radiomic and geometric features, to study their combined predictive power for analysing overall survival. A cross-validation strategy with grid search is proposed to simultaneously select and evaluate the most predictive feature subset. A Cox Proportional Hazard (CoxPH) model is employed for univariate feature selection, followed by the prediction of patient-specific survival functions by three multivariate parsimonious models, viz. Coxnet, Random survival forests (RSF) and Survival SVM (SSVM). The brain cancer MRI data used for this research was taken from two open-access collections, TCGA-GBM and TCGA-LGG, available from The Cancer Imaging Archive (TCIA). Corresponding survival data for each patient was downloaded from The Cancer Genome Atlas (TCGA). A high cross-validation C-index score of $0.82 \pm 0.10$ was achieved using RSF with the best $24$ selected features. Age was found to be the most important biological predictor. There were $9$, $6$, $6$ and $2$ features selected from the parcellation, habitat, radiomic and region-based feature groups respectively.
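For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is a plain reference implementation written for illustration, not the code used in the study; the function name `concordance_index` is hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times: observed follow-up time per patient.
    events: True/1 if death was observed, False/0 if the patient was censored.
    risk_scores: model output where a higher score means shorter expected survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk, so the reported $0.82$ indicates that roughly four out of five comparable patient pairs were ordered correctly.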