Abstract:Traditional deep learning methods in medical imaging often focus solely on segmentation or classification, limiting their ability to leverage shared information. Multi-task learning (MTL) addresses this by combining both tasks through shared representations but often struggles to balance local spatial features for segmentation and global semantic features for classification, leading to suboptimal performance. In this paper, we propose a simple yet effective UNet-based MTL model, where features extracted by the encoder are used to predict classification labels, while the decoder produces the segmentation mask. The model introduces an advanced encoder incorporating a novel ResFormer block that integrates local context from convolutional feature extraction with long-range dependencies modeled by the Transformer. This design captures broader contextual relationships and fine-grained details, improving classification and segmentation accuracy. To enhance classification performance, multi-scale features from different encoder levels are combined to leverage the hierarchical representation of the input image. For segmentation, the features passed to the decoder via skip connections are refined using a novel dilated feature enhancement (DFE) module, which captures information at different scales through three parallel convolution branches with varying dilation rates. This allows the decoder to detect lesions of varying sizes with greater accuracy. Experimental results across multiple medical datasets confirm the superior performance of our model in both segmentation and classification tasks, compared to state-of-the-art single-task and multi-task learning methods.
Abstract:Diabetic Retinopathy (DR) is a primary cause of blindness, necessitating early detection and diagnosis. This paper focuses on referable DR classification to enhance the applicability of the proposed method in clinical practice. We develop an advanced cross-learning DR classification method leveraging transfer learning and cross-attention mechanisms. The proposed method employs the Swin U-Net architecture to segment lesion maps from DR fundus images. The Swin U-Net segmentation model, enriched with DR lesion insights, is transferred to generate a lesion map. Both the fundus image and its segmented lesion map are used as complementary inputs for the classification model. A cross-attention mechanism is deployed to improve the model's ability to capture fine-grained details from the input pairs. Our experiments, utilizing two public datasets, FGADR and EyePACS, demonstrate a superior accuracy of 94.6%, surpassing current state-of-the-art methods by 4.4%. To this end, we aim for the proposed method to be seamlessly integrated into clinical workflows, enhancing accuracy and efficiency in identifying referable DR.
Abstract:Unmanned Ariel Vehicle (UAV) services with 5G connectivity is an emerging field with numerous applications. Operator-controlled UAV flights and manual static flight configurations are major limitations for the wide adoption of scalability of UAV services. Several services depend on excellent UAV connectivity with a cellular network and maintaining it is challenging in predetermined flight paths. This paper addresses these limitations by proposing a Deep Reinforcement Learning (DRL) framework for UAV path planning with assured connectivity (DUPAC). During UAV flight, DUPAC determines the best route from a defined source to the destination in terms of distance and signal quality. The viability and performance of DUPAC are evaluated under simulated real-world urban scenarios using the Unity framework. The results confirm that DUPAC achieves an autonomous UAV flight path similar to base method with only 2% increment while maintaining an average 9% better connection quality throughout the flight.
Abstract:Mobile traffic data in urban regions shows differentiated patterns during different hours of the day. The exploitation of these patterns enables highly accurate mobile traffic prediction for proactive network management. However, recent Deep Learning (DL) driven studies have only exploited spatiotemporal features and have ignored the geographical correlations, causing high complexity and erroneous mobile traffic predictions. This paper addresses these limitations by proposing an enhanced mobile traffic prediction scheme that combines the clustering strategy of daily mobile traffic peak time and novel multi Temporal Convolutional Network with a Long Short Term Memory (multi TCN-LSTM) model. The mobile network cells that exhibit peak traffic during the same hour of the day are clustered together. Our experiments on large-scale real-world mobile traffic data show up to 28% performance improvement compared to state-of-the-art studies, which confirms the efficacy and viability of the proposed approach.