Abstract:Cardiac auscultation, an integral tool in diagnosing cardiovascular diseases (CVDs), often relies on the subjective interpretation of clinicians, presenting a limitation in consistency and accuracy. Addressing this, we introduce the BUET Multi-disease Heart Sound (BMD-HS) dataset - a comprehensive and meticulously curated collection of heart sound recordings. This dataset, encompassing 864 recordings across five distinct classes of common heart sounds, represents a broad spectrum of valvular heart diseases, with a focus on diagnostically challenging cases. The standout feature of the BMD-HS dataset is its innovative multi-label annotation system, which captures a diverse range of diseases and unique disease states. This system significantly enhances the dataset's utility for developing advanced machine learning models in automated heart sound classification and diagnosis. By bridging the gap between traditional auscultation practices and contemporary data-driven diagnostic methods, the BMD-HS dataset is poised to revolutionize CVD diagnosis and management, providing an invaluable resource for the advancement of cardiac health research. The dataset is publicly available at this link: https://github.com/mHealthBuet/BMD-HS-Dataset.
Abstract:Objective: Lung auscultation is a valuable tool in diagnosing and monitoring various respiratory diseases. However, lung sounds (LS) are significantly affected by numerous sources of contamination, especially when recorded in real-world clinical settings. Conventional denoising models prove impractical for LS denoising, primarily owing to the complex spectral overlap arising from diverse noise sources. To address this issue, we propose a specialized deep-learning model (Uformer) for lung sound denoising. Methods: The proposed Uformer model consists of three modules: a Convolutional Neural Network (CNN) encoder module, dedicated to extracting latent features; a Transformer encoder module, employed to further enhance the encoding of unique LS features and effectively capture intricate long-range dependencies; and a CNN decoder module, employed to generate the denoised signals. An ablation study was performed to find the optimal architecture. Results: The performance of the proposed Uformer model was evaluated on lung sounds corrupted with different types of synthetic and real-world noise. Lung sound signals with signal-to-noise ratios (SNRs) from -12 dB to 15 dB were considered in the testing experiments. The proposed model showed an average SNR improvement of 16.51 dB when evaluated with -12 dB LS signals. Our end-to-end model, with an average SNR improvement of 19.31 dB, outperforms the existing model when evaluated with ambient noise, while using fewer parameters. Conclusion: Based on the qualitative and quantitative findings of this study, Uformer is sufficiently robust and general to assist in monitoring respiratory conditions.
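The SNR figures above can be made concrete with a small sketch. It is not the authors' evaluation code: it assumes the usual power-ratio definition of SNR and takes "SNR improvement" to mean output SNR minus input SNR. The helper names (`power`, `snr_db`, `mix_at_snr`), the toy sine signals, and the attenuation factor standing in for a denoiser are all hypothetical.

```python
import math

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def snr_db(clean, noisy):
    """SNR in dB of `noisy` measured against the reference `clean` signal."""
    residual = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(residual))

def mix_at_snr(clean, noise, target_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR in dB."""
    scale = math.sqrt(power(clean) / (power(noise) * 10.0 ** (target_db / 10.0)))
    return [c + scale * n for c, n in zip(clean, noise)]

# Toy signals (hypothetical): a slow sine as the "lung sound", a faster sine as "noise".
n = 1000
clean = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
noise = [math.sin(2 * math.pi * 50 * i / n + 0.3) for i in range(n)]

noisy = mix_at_snr(clean, noise, -12.0)
input_snr = snr_db(clean, noisy)

# Stand-in for a denoiser output: the residual noise attenuated by a factor of 10.
denoised = [c + 0.1 * (x - c) for c, x in zip(clean, noisy)]
improvement = snr_db(clean, denoised) - input_snr
```

Here a tenfold amplitude reduction of the residual noise corresponds to a 20 dB improvement, which gives a sense of scale for the reported averages.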
Abstract:Ocular Toxoplasmosis (OT) is a common eye infection caused by T. gondii that can cause vision problems. Diagnosis is typically done through clinical examination and imaging, but these methods can be complicated and costly, and they require trained personnel. To address this issue, we have created a benchmark study that evaluates the effectiveness of existing pre-trained networks using transfer learning techniques to detect OT from fundus images. Furthermore, we have analysed the performance of transfer-learning-based segmentation networks for segmenting lesions in the images. This research seeks to provide a guide for future researchers looking to utilise deep learning (DL) techniques and develop a cheap, automated, easy-to-use, and accurate diagnostic method. We have performed an in-depth analysis of different feature extraction techniques to find the optimal one for OT classification and lesion segmentation. For the classification task, we have evaluated the pre-trained models VGG16, MobileNetV2, InceptionV3, ResNet50, and DenseNet121. Among them, MobileNetV2 outperformed all other models in terms of Accuracy (Acc), Recall, and F1 Score, exceeding the second-best model, InceptionV3, by 0.7% in Acc. However, DenseNet121 achieved the best Precision, which was 0.1% higher than that of MobileNetV2. For the segmentation task, this work used the U-Net architecture. To utilize transfer learning, the encoder block of the traditional U-Net was replaced in turn by MobileNetV2, InceptionV3, ResNet34, and VGG16 to evaluate different architectures. Moreover, two different loss functions (Dice loss and Jaccard loss) were compared to find the better one. The MobileNetV2/U-Net outperformed ResNet34 by 0.5% and 2.1% in terms of Acc and Dice Score, respectively, when the Jaccard loss function was employed during training.
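The two segmentation losses named above are closely related. The following is a minimal sketch of their common "soft" formulations on flattened masks, not the implementation used in the study. The function names, the smoothing term `eps`, and the toy masks are assumptions for illustration.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss: 1 - 2*|P and T| / (|P| + |T|)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def jaccard_loss(pred, target, eps=1e-7):
    """Soft Jaccard (IoU) loss: 1 - |P and T| / |P or T|."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)

# Flattened toy masks (hypothetical): predicted lesion probabilities vs. ground truth.
pred = [0.9, 0.8, 0.1, 0.0]
target = [1.0, 1.0, 0.0, 0.0]

d = dice_loss(pred, target)
j = jaccard_loss(pred, target)
```

Because the Jaccard index J and the Dice coefficient D satisfy J = D / (2 - D), the Jaccard loss is never smaller than the Dice loss for the same prediction, so it penalizes partial overlap more heavily.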
Abstract:Cardiovascular diseases (CVDs) can be effectively treated when detected early, reducing mortality rates significantly. Traditionally, phonocardiogram (PCG) signals have been utilized for detecting cardiovascular disease due to their cost-effectiveness and simplicity. Nevertheless, various environmental and physiological noises frequently affect the PCG signals, compromising their essential distinctive characteristics. The prevalence of this issue in overcrowded and resource-constrained hospitals can compromise the accuracy of medical diagnoses. Therefore, this study aims to discover the optimal transformation method for detecting CVDs using noisy heart sound signals and to propose a noise-robust network that improves CVD classification performance. To identify the optimal transformation method for noisy heart sound data, mel-frequency cepstral coefficients (MFCCs), short-time Fourier transform (STFT), constant-Q nonstationary Gabor transform (CQT), and continuous wavelet transform (CWT) were used with VGG16. Furthermore, we propose a novel convolutional recurrent neural network (CRNN) architecture called noise robust cardio net (NRC-Net), a lightweight model that classifies mitral regurgitation, aortic stenosis, mitral stenosis, mitral valve prolapse, and normal heart sounds using PCG signals contaminated with respiratory and random noises. An attention block is included to extract important temporal and spatial features from the noise-corrupted heart sounds. The results of this study indicate that CWT is the optimal transformation method for noisy heart sound signals. When evaluated on the GitHub heart sound dataset, CWT achieves an accuracy of 95.69% with VGG16, which is 1.95% better than the second-best transformation technique, CQT. Moreover, our proposed NRC-Net with CWT obtained an accuracy of 97.4%, which is 1.71% higher than that of VGG16.
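All four transformations compared above turn a one-dimensional PCG signal into a two-dimensional time-frequency representation that an image network such as VGG16 can consume. As a rough illustration of that idea only, here is a naive magnitude STFT in plain Python. It is not the authors' preprocessing: `frame_len`, `hop`, the Hann window, and the toy tone are assumed values, and a real pipeline would normally rely on a signal-processing library.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude STFT via a naive DFT; rows are time frames, columns are frequency bins."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Toy input (hypothetical): a pure tone centred on bin 8 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
spec = stft_magnitude(tone)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

The STFT uses one fixed window length for all frequencies, whereas CQT and CWT vary their time-frequency resolution across the frequency axis.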
Abstract:Lung cancer is a leading cause of cancer-related deaths worldwide, and early detection is crucial for improving patient outcomes. Nevertheless, early diagnosis of cancer is a major challenge, particularly in low-resource settings where access to medical resources and trained radiologists is limited. The objective of this study is to propose an automated end-to-end deep learning-based framework for the early detection and classification of lung nodules, specifically for low-resource settings. The proposed framework consists of three stages: lung segmentation using a modified 3D U-Net named 3D Res-U-Net, nodule detection using YOLO-v5, and classification with a Vision Transformer-based architecture. We evaluated the proposed framework on a publicly available dataset, LUNA16. The performance of each stage was measured using the evaluation metrics of its respective domain. The proposed framework achieved a lung segmentation Dice score of 98.82% and detected lung nodules in the segmented lung with 0.76 mAP@50, at a low false-positive rate. Both networks were compared with other studies and outperformed them in segmentation and detection accuracy. Additionally, our proposed Vision Transformer network obtained an accuracy of 93.57%, which is 1.21% higher than the state-of-the-art networks. Our proposed end-to-end deep learning-based framework can effectively segment lungs and detect and classify lung nodules, specifically in low-resource settings with limited access to radiologists. The proposed framework outperforms existing studies on all the respective evaluation metrics. It can potentially improve the accuracy and efficiency of lung cancer screening in low-resource settings, ultimately leading to better patient outcomes.
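The mAP@50 figure above counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows only that matching criterion for two-dimensional axis-aligned boxes. It is not the study's evaluation code, and the function names and box format `(x1, y1, x2, y2)` are assumptions for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so that non-overlapping boxes have no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """True when the predicted box overlaps the ground truth with IoU >= threshold."""
    return box_iou(pred_box, gt_box) >= threshold

# Hypothetical nodule boxes: one prediction close to the ground truth, one shifted further away.
gt = (0, 0, 10, 10)
close_hit = is_true_positive((2, 0, 12, 10), gt)
far_miss = is_true_positive((5, 0, 15, 10), gt)
```

A full mAP computation additionally ranks detections by confidence and averages precision over recall levels and classes; only the IoU test is shown here.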