Abstract:The data-intensive nature of supervised classification drives the interest of the researchers towards unsupervised approaches, especially for problems such as medical image segmentation, where labeled data is scarce. Building on the recent advancements of Vision transformers (ViT) in computer vision, we propose an unsupervised segmentation framework using a pre-trained Dino-ViT. In the proposed method, we leverage the inherent graph structure within the image to realize a significant performance gain for segmentation in medical images. For this, we introduce a modularity-based loss function coupled with a Graph Attention Network (GAT) to effectively capture the inherent graph topology within the image. Our method achieves state-of-the-art performance, even significantly surpassing or matching that of existing (semi)supervised technique such as MedSAM which is a Segment Anything Model in medical images. We demonstrate this using two challenging medical image datasets ISIC-2018 and CVC-ColonDB. This work underscores the potential of unsupervised approaches in advancing medical image analysis in scenarios where labeled data is scarce. The github repository of the code is available on [https://github.com/mudit-adityaja/UnSegMedGAT].
Abstract:The data-hungry approach of supervised classification drives the interest of the researchers toward unsupervised approaches, especially for problems such as medical image segmentation, where labeled data are difficult to get. Motivated by the recent success of Vision transformers (ViT) in various computer vision tasks, we propose an unsupervised segmentation framework with a pre-trained ViT. Moreover, by harnessing the graph structure inherent within the image, the proposed method achieves a notable performance in segmentation, especially in medical images. We further introduce a modularity-based loss function coupled with an Auto-Regressive Moving Average (ARMA) filter to capture the inherent graph topology within the image. Finally, we observe that employing Scaled Exponential Linear Unit (SELU) and SILU (Swish) activation functions within the proposed Graph Neural Network (GNN) architecture enhances the performance of segmentation. The proposed method provides state-of-the-art performance (even comparable to supervised methods) on benchmark image segmentation datasets such as ECSSD, DUTS, and CUB, as well as challenging medical image segmentation datasets such as KVASIR, CVC-ClinicDB, ISIC-2018. The github repository of the code is available on \url{https://github.com/ksgr5566/UnSeGArmaNet}.
Abstract:Recent developments in self-supervised learning (SSL) have made it possible to learn data representations without the need for annotations. Inspired by the non-contrastive SSL approach (SimSiam), we introduce a novel framework SIMSAM to compute the Semantic Affinity Matrix, which is significant for unsupervised image segmentation. Given an image, SIMSAM first extracts features using pre-trained DINO-ViT, then projects the features to predict the correlations of dense features in a non-contrastive way. We show applications of the Semantic Affinity Matrix in object segmentation and semantic segmentation tasks. Our code is available at https://github.com/chandagrover/SimSAM.
Abstract:Image segmentation, the process of partitioning an image into meaningful regions, plays a pivotal role in computer vision and medical imaging applications. Unsupervised segmentation, particularly in the absence of labeled data, remains a challenging task due to the inter-class similarity and variations in intensity and resolution. In this study, we extract high-level features of the input image using pretrained vision transformer. Subsequently, the proposed method leverages the underlying graph structures of the images, seeking to discover and delineate meaningful boundaries using graph neural networks and modularity based optimization criteria without relying on pre-labeled training data. Experimental results on benchmark datasets demonstrate the effectiveness and versatility of the proposed approach, showcasing competitive performance compared to the state-of-the-art unsupervised segmentation methods. This research contributes to the broader field of unsupervised medical imaging and computer vision by presenting an innovative methodology for image segmentation that aligns with real-world challenges. The proposed method holds promise for diverse applications, including medical imaging, remote sensing, and object recognition, where labeled data may be scarce or unavailable. The github repository of the code is available on [https://github.com/ksgr5566/unseggnet]
Abstract:Cancelable Biometric is a challenging research field in which security of an original biometric image is ensured by transforming the original biometric into another irreversible domain. Several approaches have been suggested in literature for generating cancelable biometric templates. In this paper, two novel and simple cancelable biometric template generation methods based on Random Walk (CBRW) have been proposed. By employing random walk and other steps given in the proposed two algorithms viz. CBRW-BitXOR and CBRW-BitCMP, the original biometric is transformed into a cancellable template. The performance of the proposed methods is compared with other state-of-the-art methods. Experiments have been performed on eight publicly available gray and color datasets i.e. CP (ear) (gray and color), UTIRIS (iris) (gray and color), ORL (face) (gray), IIT Delhi (iris) (gray and color), and AR (face) (color). Performance of the generated templates is measured in terms of Correlation Coefficient (Cr), Root Mean Square Error (RMSE), Peak Signal to Noise Ratio (PSNR), Structural Similarity (SSIM), Mean Absolute Error (MAE), Number of Pixel Change Rate (NPCR), and Unified Average Changing Intensity (UACI). By experimental results, it has been proved that proposed methods are superior than other state-of-the-art methods in qualitative as well as quantitative analysis. Furthermore, CBRW performs better on both gray as well as color images.
Abstract:Cancelable Biometric is repetitive distortion embedded in original Biometric image for keeping it secure from unauthorized access. In this paper, we have generated Cancelable Biometric templates with Reverse Boolean XOR technique. Three different methods have been proposed for generation of Cancelable Biometric templates based on Visual Secret Sharing scheme. In each method, one Secret image and n-1 Cover images are used as: (M1) One original Biometric image (Secret) with n- 1 randomly chosen Gray Cover images (M2) One original Secret image with n-1 Cover images, which are Randomly Permuted version of the original Secret image (M3) One Secret image with n-1 Cover images, both Secret image and Cover images are Randomly Permuted version of original Biometric image. Experiment works have performed on publicly available ORL Face database and IIT Delhi Iris database. The performance of the proposed methods is compared in terms of Co-relation Coefficient (Cr), Mean Square Error (MSE), Mean Absolute Error (MAE), Structural Similarity (SSIM), Peak Signal to Noise Ratio (PSNR), Number of Pixel Change Rate (NPCR), and Unified Average Changing Intensity (UACI). It is found that among the three proposed method, M3 generates good quality Cancelable templates and gives best performance in terms of quality. M3 is also better in quantitative terms on ORL dataset while M2 and M3 are comparable on IIT Delhi Iris dataset.