Abstract: Predictive modeling in healthcare continues to be an active actuarial research topic as more insurance companies aim to maximize the potential of Machine Learning (ML) approaches to increase their productivity and efficiency. In this paper, the authors deployed three regression-based ensemble ML models that combine variations of decision trees through the Extreme Gradient Boosting (XGBoost), Gradient-boosting Machine, and Random Forest (RF) methods to predict medical insurance costs. Two Explainable Artificial Intelligence (XAI) methods, SHapley Additive exPlanations (SHAP) and Individual Conditional Expectation (ICE) plots, were deployed to discover and explain the key determinant factors that influence medical insurance premium prices in the dataset. The dataset used comprised 986 records and is publicly available in the Kaggle repository. The models were evaluated using four performance metrics: R-squared, Mean Absolute Error, Root Mean Squared Error, and Mean Absolute Percentage Error. The results show that all models performed well; however, the XGBoost model achieved a better overall performance, although it also expended more computational resources, while the RF model recorded a lesser prediction error and consumed far fewer computing resources than the XGBoost model. Furthermore, we compared the outcomes of both XAI methods in identifying the key determinant features that influenced the PremiumPrices for each model. Whereas both XAI methods produced similar outcomes, we found that the ICE plots showed the interactions between variables in more detail than the SHAP analysis, which seemed to be more high-level. The authors hope that the contributions of this study will help policymakers, insurers, and potential medical insurance buyers in their decision-making process for selecting the right policies that meet their specific needs.
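The four evaluation metrics named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the formulas are the standard textbook definitions, assuming no true value is zero (MAPE is undefined otherwise).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, MAE, RMSE, and MAPE (in percent) for two equal-length sequences.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean Absolute Error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n
    # Root Mean Squared Error: penalizes large errors more heavily than MAE.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Mean Absolute Percentage Error: error relative to the true value, in percent.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R-squared: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse, "mape": mape}


if __name__ == "__main__":
    # Made-up premium values, purely for demonstration.
    actual = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print(regression_metrics(actual, predicted))
```

Reporting several metrics together is useful because they respond differently to the same errors: RMSE weights large misses more heavily, while MAPE expresses error relative to the size of each premium.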
Abstract: The data revolution experienced in recent times has thrown up new challenges and opportunities for businesses of all sizes in diverse industries. Big data analytics is already at the forefront of innovations that help businesses make meaningful decisions from the abundance of raw data available today. Business intelligence and analytics (BIA) has become a major trend in today's IT world as companies of all sizes look to improve their business processes and scale up using data-driven solutions. This paper aims to demonstrate the data analytical process of deriving business intelligence from the historical data of a fictional bike share company seeking innovative ways to convert its casual riders to annual paying members. The dataset used is freely available as Chicago Divvy Bicycle Sharing Data on Kaggle. The authors used the R Tidyverse library in RStudio to analyse the data and followed the six data analysis steps of ask, prepare, process, analyse, share, and act to recommend some actionable approaches the company could adopt to convert casual riders to paying annual members. The findings from this research serve as a valuable case example of a real-world deployment of BIA technologies in the industry, and as a demonstration of the data analysis cycle for data practitioners, researchers, and other potential users.
Abstract: Timely and rapid diagnoses are core to informing optimum interventions that curb the spread of COVID-19. The use of medical images such as chest X-rays and CTs has been advocated to supplement the Reverse-Transcription Polymerase Chain Reaction (RT-PCR) test, which in turn has stimulated the application of deep learning techniques in the development of automated systems for the detection of infections. Decision support systems ease the challenges inherent to the physical examination of images, which is both time-consuming and requires interpretation by highly trained clinicians. A review of relevant reported studies to date shows that most of the deep learning approaches utilised are not amenable to implementation on resource-constrained devices. Given that the rate of infections is increasing, rapid, trusted diagnoses are a central tool in the management of the spread, mandating a need for low-cost, mobile point-of-care detection systems, especially for middle- and low-income nations. The paper presents the development and performance evaluation of a lightweight deep learning technique for the detection of COVID-19 using the MobileNetV2 model. Results demonstrate that the performance of the lightweight deep learning model is competitive with that of heavyweight models while delivering a significant increase in the efficiency of deployment, notably by lowering the cost and memory requirements of computing resources.