Abstract:Multimodal fusion learning has shown significant promise in classifying various diseases such as skin cancer and brain tumors. However, existing methods face three key limitations. First, they often lack generalizability to other diagnosis tasks due to their focus on a particular disease. Second, they do not fully leverage multiple health records from diverse modalities to learn robust complementary information. And finally, they typically rely on a single attention mechanism, missing the benefits of multiple attention strategies within and across various modalities. To address these issues, this paper proposes a dual robust information fusion attention mechanism (DRIFA) that leverages two attention modules, i.e. multi-branch fusion attention module and the multimodal information fusion attention module. DRIFA can be integrated with any deep neural network, forming a multimodal fusion learning framework denoted as DRIFA-Net. We show that the multi-branch fusion attention of DRIFA learns enhanced representations for each modality, such as dermoscopy, pap smear, MRI, and CT-scan, whereas multimodal information fusion attention module learns more refined multimodal shared representations, improving the network's generalization across multiple tasks and enhancing overall performance. Additionally, to estimate the uncertainty of DRIFA-Net predictions, we have employed an ensemble Monte Carlo dropout strategy. Extensive experiments on five publicly available datasets with diverse modalities demonstrate that our approach consistently outperforms state-of-the-art methods. The code is available at https://github.com/misti1203/DRIFA-Net.
Abstract:The First-Mile Last-Mile (FMLM) connectivity is crucial for improving public transit accessibility and efficiency, particularly in sprawling suburban regions where traditional fixed-route transit systems are often inadequate. Autonomous on-Demand Shuttles (AODS) hold a promising option for FMLM connections due to their cost-effectiveness and improved safety features, thereby enhancing user convenience and reducing reliance on personal vehicles. A critical issue in AODS service design is the optimization of travel paths, for which realistic traffic network assignment combined with optimal routing offers a viable solution. In this study, we have designed an AODS controller that integrates a mesoscopic simulation-based dynamic traffic assignment model with a greedy insertion heuristics approach to optimize the travel routes of the shuttles. The controller also considers the charging infrastructure/strategies and the impact of the shuttles on regular traffic flow for routes and fleet-size planning. The controller is implemented in Aimsun traffic simulator considering Lake Nona in Orlando, Florida as a case study. We show that, under the present demand based on 1% of total trips as transit riders, a fleet of 3 autonomous shuttles can serve about 80% of FMLM trip requests on-demand basis with an average waiting time below 4 minutes. Additional power sources have significant effect on service quality as the inactive waiting time for charging would increase the fleet size. We also show that low-speed autonomous shuttles would have negligible impact on regular vehicle flow, making them suitable for suburban areas. These findings have important implications for sustainable urban planning and public transit operations.
Abstract:In this paper, we propose a solution which uses state-of-the-art techniques in Deep Learning to tackle the problem of Bengali Handwritten Character Recognition ( HCR ). Our method uses lesser iterations to train than most other comparable methods. We employ Transfer Learning on ResNet 50, a state-of-the-art deep Convolutional Neural Network Model, pretrained on ImageNet dataset. We also use other techniques like a modified version of One Cycle Policy, varying the input image sizes etc. to ensure that our training occurs fast. We use the BanglaLekha-Isolated Dataset for evaluation of our technique which consists of 84 classes (50 Basic, 10 Numerals and 24 Compound Characters). We are able to achieve 96.12% accuracy in just 47 epochs on BanglaLekha-Isolated dataset. When comparing our method with that of other researchers, considering number of classes and without using Ensemble Learning, the proposed solution achieves state of the art result for Handwritten Bengali Character Recognition. Code and weight files are available at https://github.com/swagato-c/bangla-hwcr-present.
Abstract:Tumor segmentation from magnetic resonance imaging (MRI) data is an important but time consuming manual task performed by medical experts. Automating this process is a challenging task because of the high diversity in the appearance of tumor tissues among different patients and in many cases similarity with the normal tissues. MRI is an advanced medical imaging technique providing rich information about the human soft-tissue anatomy. There are different brain tumor detection and segmentation methods to detect and segment a brain tumor from MRI images. These detection and segmentation approaches are reviewed with an importance placed on enlightening the advantages and drawbacks of these methods for brain tumor detection and segmentation. The use of MRI image detection and segmentation in different procedures are also described. Here a brief review of different segmentation for detection of brain tumor from MRI of brain has been discussed.
Abstract:The detection of weapons concealed underneath a person cloths is very much important to the improvement of the security of the public as well as the safety of public assets like airports, buildings and railway stations etc.
Abstract:Image binarization is the process of separation of pixel values into two groups, white as background and black as foreground. Thresholding plays a major in binarization of images. Thresholding can be categorized into global thresholding and local thresholding. In images with uniform contrast distribution of background and foreground like document images, global thresholding is more appropriate. In degraded document images, where considerable background noise or variation in contrast and illumination exists, there exists many pixels that cannot be easily classified as foreground or background. In such cases, binarization with local thresholding is more appropriate. This paper describes a locally adaptive thresholding technique that removes background by using local mean and mean deviation. Normally the local mean computational time depends on the window size. Our technique uses integral sum image as a prior processing to calculate local mean. It does not involve calculations of standard deviations as in other local adaptive techniques. This along with the fact that calculations of mean is independent of window size speed up the process as compared to other local thresholding techniques.