Abstract:Progressive driver behavior analytics is crucial for improving road safety and mitigating the issues caused by aggressive or inattentive driving. Previous studies have employed machine learning and deep learning techniques, which often result in low feature optimization, thereby compromising both high performance and interpretability. To fill these voids, this paper proposes a hybrid approach to driver behavior analysis that uses a 12,857-row and 18-column data set taken from Kaggle. After applying preprocessing techniques such as label encoding, random oversampling, and standard scaling, 13 machine learning algorithms were tested. The Random Forest Classifier achieved an accuracy of 95%. After deploying the LIME technique in XAI, the top 10 features with the most significant positive and negative influence on accuracy were identified, and the same algorithms were retrained. The accuracy of the Random Forest Classifier decreased slightly to 94.2%, confirming that the efficiency of the model can be improved without sacrificing performance. This hybrid model can provide a return on investment in terms of the predictive power and explainability of the driver behavior process.
Abstract:The rapid advancement of deep learning in medical image analysis has greatly enhanced the accuracy of skin cancer classification. However, current state-of-the-art models, especially those based on transfer learning like ResNet50, come with significant computational overhead, rendering them impractical for deployment in resource-constrained environments. This study proposes a custom CNN model that achieves a 96.7\% reduction in parameters (from 23.9 million in ResNet50 to 692,000) while maintaining a classification accuracy deviation of less than 0.022\%. Our empirical analysis of the HAM10000 dataset reveals that although transfer learning models provide a marginal accuracy improvement of approximately 0.022\%, they result in a staggering 13,216.76\% increase in FLOPs, considerably raising computational costs and inference latency. In contrast, our lightweight CNN architecture, which encompasses only 30.04 million FLOPs compared to ResNet50's 4.00 billion, significantly reduces energy consumption, memory footprint, and inference time. These findings underscore the trade-off between the complexity of deep models and their real-world feasibility, positioning our optimized CNN as a practical solution for mobile and edge-based skin cancer diagnostics.
Abstract:Establishing a communication bridge by transferring data driven from different embedded sensors via internet or reconcilable network protocols between enormous number of distinctively addressable objects or "things", is known as the Internet of Things (IoT). IoT can be amalgamated with multitudinous objects such as thermostats, cars, lights, refrigerators, and many more appliances which will be able to build a connection via internet. Where objects of our diurnal life can establish a network connection and get smarter with IoT, robotics can be another aspect which will get beneficial to be brought under the concept of IoT and is able to add a new perception in robotics having "Mechanical Smart Intelligence" which is generally called "Internet of Robotic Things" (IoRT). A robotic arm is a part of robotics where it is usually a programmable mechanical arm which has human arm like functionalities. In this paper, IoRT will be represented by a 5 DoF (degree of freedoms) Robotic Arm which will be able to communicate as an IoRT device, controlled with heterogeneous devices using IoT and "Cloud Robotics".
Abstract:There are a variety of features of the human voice that can be classified as pitch, timbre, loudness, and vocal tone. It is observed in numerous incidents that human expresses their feelings using different vocal qualities when they are speaking. The primary objective of this research is to recognize different emotions of human beings such as anger, sadness, fear, neutrality, disgust, pleasant surprise, and happiness by using several MATLAB functions namely, spectral descriptors, periodicity, and harmonicity. To accomplish the work, we analyze the CREMA-D (Crowd-sourced Emotional Multimodal Actors Data) & TESS (Toronto Emotional Speech Set) datasets of human speech. The audio file contains data that have various characteristics (e.g., noisy, speedy, slow) thereby the efficiency of the ML (Machine Learning) models increases significantly. The EMD (Empirical Mode Decomposition) is utilized for the process of signal decomposition. Then, the features are extracted through the use of several techniques such as the MFCC, GTCC, spectral centroid, roll-off point, entropy, spread, flux, harmonic ratio, energy, skewness, flatness, and audio delta. The data is trained using some renowned ML models namely, Support Vector Machine, Neural Network, Ensemble, and KNN. The algorithms show an accuracy of 67.7%, 63.3%, 61.6%, and 59.0% respectively for the test data and 77.7%, 76.1%, 99.1%, and 61.2% for the training data. We have conducted experiments using Matlab and the result shows that our model is very prominent and flexible than existing similar works.