Abstract:Knee osteoporosis weakens the bone tissue in the knee joint, increasing fracture risk. Early detection through X-ray images enables timely intervention and improved patient outcomes. While some researchers have focused on diagnosing knee osteoporosis through manual radiology evaluation and traditional machine learning using hand-crafted features, these methods often struggle with performance and efficiency due to reliance on manual feature extraction and subjective interpretation. In this study, we propose a computer-aided diagnosis (CAD) system for knee osteoporosis, combining transfer learning with stacked feature enhancement deep learning blocks. Initially, knee X-ray images are preprocessed, and features are extracted using a pre-trained Convolutional Neural Network (CNN). These features are then enhanced through five sequential Conv-RELU-MaxPooling blocks. The Conv2D layers detect low-level features, while the ReLU activations introduce non-linearity, allowing the network to learn complex patterns. MaxPooling layers down-sample the features, retaining the most important spatial information. This sequential processing enables the model to capture complex, high-level features related to bone structure, joint deformation, and osteoporotic markers. The enhanced features are passed through a classification module to differentiate between healthy and osteoporotic knee conditions. Extensive experiments on three individual datasets and a combined dataset demonstrate that our model achieves 97.32%, 98.24%, 97.27%, and 98.00% accuracy for OKX Kaggle Binary, KXO-Mendeley Multi-Class, OKX Kaggle Multi-Class, and the combined dataset, respectively, showing an improvement of around 2% over existing methods.
Abstract:Heart disease is a leading cause of premature death worldwide, particularly among middle-aged and older adults, with men experiencing a higher prevalence. According to the World Health Organization (WHO), non-communicable diseases, including heart disease, account for 25\% (17.9 million) of global deaths, with over 43,204 annual fatalities in Bangladesh. However, the development of heart disease detection (HDD) systems tailored to the Bangladeshi population remains underexplored due to the lack of benchmark datasets and reliance on manual or limited-data approaches. This study addresses these challenges by introducing new, ethically sourced HDD dataset, BIG-Dataset and CD dataset which incorporates comprehensive data on symptoms, examination techniques, and risk factors. Using advanced machine learning techniques, including Logistic Regression and Random Forest, we achieved a remarkable testing accuracy of up to 96.6\% with Random Forest. The proposed AI-driven system integrates these models and datasets to provide real-time, accurate diagnostics and personalized healthcare recommendations. By leveraging structured datasets and state-of-the-art machine learning algorithms, this research offers an innovative solution for scalable and effective heart disease detection, with the potential to reduce mortality rates and improve clinical outcomes.
Abstract:Post-traumatic stress disorder (PTSD) is a significant mental health challenge that affects individuals exposed to traumatic events. Early detection and effective intervention for PTSD are crucial, as it can lead to long-term psychological distress if untreated. Accurate detection of PTSD is essential for timely and targeted mental health interventions, especially in disaster-affected populations. Existing research has explored machine learning approaches for classifying PTSD, but many face limitations in terms of model performance and generalizability. To address these issues, we implemented a comprehensive preprocessing pipeline. This included data cleaning, missing value treatment using the SimpleImputer, label encoding of categorical variables, data augmentation using SMOTE to balance the dataset, and feature scaling with StandardScaler. The dataset was split into 80\% training and 20\% testing. We developed an ensemble model using a majority voting technique among several classifiers, including Logistic Regression, Support Vector Machines (SVM), Random Forest, XGBoost, LightGBM, and a customized Artificial Neural Network (ANN). The ensemble model achieved an accuracy of 96.76\% with a benchmark dataset, significantly outperforming individual models. The proposed method's advantages include improved robustness through the combination of multiple models, enhanced ability to generalize across diverse data points, and increased accuracy in detecting PTSD. Additionally, the use of SMOTE for data augmentation ensured better handling of imbalanced datasets, leading to more reliable predictions. The proposed approach offers valuable insights for policymakers and healthcare providers by leveraging predictive analytics to address mental health issues in vulnerable populations, particularly those affected by disasters.
Abstract:In recent years, advanced artificial intelligence technologies, such as ChatGPT, have significantly impacted various fields, including education and research. Developed by OpenAI, ChatGPT is a powerful language model that presents numerous opportunities for students and educators. It offers personalized feedback, enhances accessibility, enables interactive conversations, assists with lesson preparation and evaluation, and introduces new methods for teaching complex subjects. However, ChatGPT also poses challenges to traditional education and research systems. These challenges include the risk of cheating on online exams, the generation of human-like text that may compromise academic integrity, a potential decline in critical thinking skills, and difficulties in assessing the reliability of information generated by AI. This study examines both the opportunities and challenges ChatGPT brings to education from the perspectives of students and educators. Specifically, it explores the role of ChatGPT in helping students develop their subjective skills. To demonstrate its effectiveness, we conducted several subjective experiments using ChatGPT, such as generating solutions from subjective problem descriptions. Additionally, surveys were conducted with students and teachers to gather insights into how ChatGPT supports subjective learning and teaching. The results and analysis of these surveys are presented to highlight the impact of ChatGPT in this context.
Abstract:Fire hazards are extremely dangerous, particularly in sectors such as the transportation industry, where political unrest increases the likelihood of their occurrence. By employing IP cameras to facilitate the setup of fire detection systems on transport vehicles, losses from fire events may be prevented proactively. However, the development of lightweight fire detection models is required due to the computational constraints of the embedded systems within these cameras. We introduce FireLite, a low-parameter convolutional neural network (CNN) designed for quick fire detection in contexts with limited resources, in response to this difficulty. With an accuracy of 98.77\%, our model -- which has just 34,978 trainable parameters achieves remarkable performance numbers. It also shows a validation loss of 8.74 and peaks at 98.77 for precision, recall, and F1-score measures. Because of its precision and efficiency, FireLite is a promising solution for fire detection in resource-constrained environments.
Abstract:Weakly supervised video anomaly detection (WS-VAD) is a crucial area in computer vision for developing intelligent surveillance systems. This system uses three feature streams: RGB video, optical flow, and audio signals, where each stream extracts complementary spatial and temporal features using an enhanced attention module to improve detection accuracy and robustness. In the first stream, we employed an attention-based, multi-stage feature enhancement approach to improve spatial and temporal features from the RGB video where the first stage consists of a ViT-based CLIP module, with top-k features concatenated in parallel with I3D and Temporal Contextual Aggregation (TCA) based rich spatiotemporal features. The second stage effectively captures temporal dependencies using the Uncertainty-Regulated Dual Memory Units (UR-DMU) model, which learns representations of normal and abnormal data simultaneously, and the third stage is employed to select the most relevant spatiotemporal features. The second stream extracted enhanced attention-based spatiotemporal features from the flow data modality-based feature by taking advantage of the integration of the deep learning and attention module. The audio stream captures auditory cues using an attention module integrated with the VGGish model, aiming to detect anomalies based on sound patterns. These streams enrich the model by incorporating motion and audio signals often indicative of abnormal events undetectable through visual analysis alone. The concatenation of the multimodal fusion leverages the strengths of each modality, resulting in a comprehensive feature set that significantly improves anomaly detection accuracy and robustness across three datasets. The extensive experiment and high performance with the three benchmark datasets proved the effectiveness of the proposed system over the existing state-of-the-art system.
Abstract:Hand gesture-based sign language recognition (SLR) is one of the most advanced applications of machine learning, and computer vision uses hand gestures. Although, in the past few years, many researchers have widely explored and studied how to address BSL problems, specific unaddressed issues remain, such as skeleton and transformer-based BSL recognition. In addition, the lack of evaluation of the BSL model in various concealed environmental conditions can prove the generalized property of the existing model by facing daily life signs. As a consequence, existing BSL recognition systems provide a limited perspective of their generalisation ability as they are tested on datasets containing few BSL alphabets that have a wide disparity in gestures and are easy to differentiate. To overcome these limitations, we propose a spatial-temporal attention-based BSL recognition model considering hand joint skeletons extracted from the sequence of images. The main aim of utilising hand skeleton-based BSL data is to ensure the privacy and low-resolution sequence of images, which need minimum computational cost and low hardware configurations. Our model captures discriminative structural displacements and short-range dependency based on unified joint features projected onto high-dimensional feature space. Specifically, the use of Separable TCN combined with a powerful multi-head spatial-temporal attention architecture generated high-performance accuracy. The extensive experiments with a proposed dataset and two benchmark BSL datasets with a wide range of evaluations, such as intra- and inter-dataset evaluation settings, demonstrated that our proposed models achieve competitive performance with extremely low computational complexity and run faster than existing models.
Abstract:Surface electromyography (EMG) serves as a pivotal tool in hand gesture recognition and human-computer interaction, offering a non-invasive means of signal acquisition. This study presents a novel methodology for classifying hand gestures using EMG signals. To address the challenges associated with feature extraction where, we explored 23 distinct morphological, time domain and frequency domain feature extraction techniques. However, the substantial size of the features may increase the computational complexity issues that can hinder machine learning algorithm performance. We employ an efficient feature selection approach, specifically an extra tree classifier, to mitigate this. The selected potential feature fed into the various machine learning-based classification algorithms where our model achieved 97.43\% accuracy with the KNN algorithm and selected feature. By leveraging a comprehensive feature extraction and selection strategy, our methodology enhances the accuracy and usability of EMG-based hand gesture recognition systems. The higher performance accuracy proves the effectiveness of the proposed model over the existing system. \keywords{EMG signal, machine learning approach, hand gesture recognition.
Abstract:The prevention of falls is paramount in modern healthcare, particularly for the elderly, as falls can lead to severe injuries or even fatalities. Additionally, the growing incidence of falls among the elderly, coupled with the urgent need to prevent suicide attempts resulting from medication overdose, underscores the critical importance of accurate and efficient fall detection methods. In this scenario, a computer-aided fall detection system is inevitable to save elderly people's lives worldwide. Many researchers have been working to develop fall detection systems. However, the existing fall detection systems often struggle with issues such as unsatisfactory performance accuracy, limited robustness, high computational complexity, and sensitivity to environmental factors due to a lack of effective features. In response to these challenges, this paper proposes a novel three-stream spatial-temporal feature-based fall detection system. Our system incorporates joint skeleton-based spatial and temporal Graph Convolutional Network (GCN) features, joint motion-based spatial and temporal GCN features, and residual connections-based features. Each stream employs adaptive graph-based feature aggregation and consecutive separable convolutional neural networks (Sep-TCN), significantly reducing computational complexity and model parameters compared to prior systems. Experimental results across multiple datasets demonstrate the superior effectiveness and efficiency of our proposed system, with accuracies of 99.51\%, 99.15\%, 99.79\% and 99.85 \% achieved on the ImViA, UR-Fall, Fall-UP and FU-Kinect datasets, respectively. The remarkable performance of our system highlights its superiority, efficiency, and generalizability in real-world fall detection scenarios, offering significant advancements in healthcare and societal well-being.
Abstract:The Bengali language is the 5th most spoken native and 7th most spoken language in the world, and Bengali handwritten character recognition has attracted researchers for decades. However, other languages such as English, Arabic, Turkey, and Chinese character recognition have contributed significantly to developing handwriting recognition systems. Still, little research has been done on Bengali character recognition because of the similarity of the character, curvature and other complexities. However, many researchers have used traditional machine learning and deep learning models to conduct Bengali hand-written recognition. The study employed a convolutional neural network (CNN) with ensemble transfer learning and a multichannel attention network. We generated the feature from the two branches of the CNN, including Inception Net and ResNet and then produced an ensemble feature fusion by concatenating them. After that, we applied the attention module to produce the contextual information from the ensemble features. Finally, we applied a classification module to refine the features and classification. We evaluated the proposed model using the CAMTERdb 3.1.2 data set and achieved 92\% accuracy for the raw dataset and 98.00\% for the preprocessed dataset. We believe that our contribution to the Bengali handwritten character recognition domain will be considered a great development.