Abstract:Objective: Head impact information including impact directions, speeds and force are important to study traumatic brain injury, design and evaluate protective gears. This study presents a deep learning model developed to accurately predict head impact information, including location, speed, orientation, and force, based on head kinematics during helmeted impacts. Methods: Leveraging a dataset of 16,000 simulated helmeted head impacts using the Riddell helmet finite element model, we implemented a Long Short-Term Memory (LSTM) network to process the head kinematics: tri-axial linear accelerations and angular velocities. Results: The models accurately predict the impact parameters describing impact location, direction, speed, and the impact force profile with R2 exceeding 70% for all tasks. Further validation was conducted using an on-field dataset recorded by instrumented mouthguards and videos, consisting of 79 head impacts in which the impact location can be clearly identified. The deep learning model significantly outperformed existing methods, achieving a 79.7% accuracy in identifying impact locations, compared to lower accuracies with traditional methods (the highest accuracy of existing methods is 49.4%). Conclusion: The precision underscores the model's potential in enhancing helmet design and safety in sports by providing more accurate impact data. Future studies should test the models across various helmets and sports on large in vivo datasets to validate the accuracy of the models, employing techniques like transfer learning to broaden its effectiveness.
Abstract:Brain deformation caused by a head impact leads to traumatic brain injury (TBI). The maximum principal strain (MPS) was used to measure the extent of brain deformation and predict injury, and the recent evidence has indicated that incorporating the maximum principal strain rate (MPSR) and the product of MPS and MPSR, denoted as MPSxSR, enhances the accuracy of TBI prediction. However, ambiguities have arisen about the calculation of MPSR. Two schemes have been utilized: one (MPSR1) is to use the time derivative of MPS, and another (MPSR2) is to use the first eigenvalue of the strain rate tensor. Both MPSR1 and MPSR2 have been applied in previous studies to predict TBI. To quantify the discrepancies between these two methodologies, we conducted a comparison of these two MPSR methodologies across nine in-vivo and in-silico head impact datasets and found that 95MPSR1 was 5.87% larger than 95MPSR2, and 95MPSxSR1 was 2.55% larger than 95MPSxSR2. Across every element in all head impacts, MPSR1 was 8.28% smaller than MPSR2, and MPSxSR1 was 8.11% smaller than MPSxSR2. Furthermore, logistic regression models were trained to predict TBI based on the MPSR (or MPSxSR), and no significant difference was observed in the predictability across different variables. The consequence of misuse of MPSR and MPSxSR thresholds (i.e. compare threshold of 95MPSR1 with value from 95MPSR2 to determine if the impact is injurious) was investigated, and the resulting false rates were found to be around 1%. The evidence suggested that these two methodologies were not significantly different in detecting TBI.
Abstract:Accurately labeling biomedical data presents a challenge. Traditional semi-supervised learning methods often under-utilize available unlabeled data. To address this, we propose a novel reliability-based training data cleaning method employing inductive conformal prediction (ICP). This method capitalizes on a small set of accurately labeled training data and leverages ICP-calculated reliability metrics to rectify mislabeled data and outliers within vast quantities of noisy training data. The efficacy of the method is validated across three classification tasks within distinct modalities: filtering drug-induced-liver-injury (DILI) literature with title and abstract, predicting ICU admission of COVID-19 patients through CT radiomics and electronic health records, and subtyping breast cancer using RNA-sequencing data. Varying levels of noise to the training labels were introduced through label permutation. Results show significant enhancements in classification performance: accuracy enhancement in 86 out of 96 DILI experiments (up to 11.4%), AUROC and AUPRC enhancements in all 48 COVID-19 experiments (up to 23.8% and 69.8%), and accuracy and macro-average F1 score improvements in 47 out of 48 RNA-sequencing experiments (up to 74.6% and 89.0%). Our method offers the potential to substantially boost classification performance in multi-modal biomedical machine learning tasks. Importantly, it accomplishes this without necessitating an excessive volume of meticulously curated training data.
Abstract:Machine learning head models (MLHMs) are developed to estimate brain deformation for early detection of traumatic brain injury (TBI). However, the overfitting to simulated impacts and the lack of generalizability caused by distributional shift of different head impact datasets hinders the broad clinical applications of current MLHMs. We propose brain deformation estimators that integrates unsupervised domain adaptation with a deep neural network to predict whole-brain maximum principal strain (MPS) and MPS rate (MPSR). With 12,780 simulated head impacts, we performed unsupervised domain adaptation on on-field head impacts from 302 college football (CF) impacts and 457 mixed martial arts (MMA) impacts using domain regularized component analysis (DRCA) and cycle-GAN-based methods. The new model improved the MPS/MPSR estimation accuracy, with the DRCA method significantly outperforming other domain adaptation methods in prediction accuracy (p<0.001): MPS RMSE: 0.027 (CF) and 0.037 (MMA); MPSR RMSE: 7.159 (CF) and 13.022 (MMA). On another two hold-out test sets with 195 college football impacts and 260 boxing impacts, the DRCA model significantly outperformed the baseline model without domain adaptation in MPS and MPSR estimation accuracy (p<0.001). The DRCA domain adaptation reduces the MPS/MPSR estimation error to be well below TBI thresholds, enabling accurate brain deformation estimation to detect TBI in future clinical applications.
Abstract:Wearable sensors for measuring head kinematics can be noisy due to imperfect interfaces with the body. Mouthguards are used to measure head kinematics during impacts in traumatic brain injury (TBI) studies, but deviations from reference kinematics can still occur due to potential looseness. In this study, deep learning is used to compensate for the imperfect interface and improve measurement accuracy. A set of one-dimensional convolutional neural network (1D-CNN) models was developed to denoise mouthguard kinematics measurements along three spatial axes of linear acceleration and angular velocity. The denoised kinematics had significantly reduced errors compared to reference kinematics, and reduced errors in brain injury criteria and tissue strain and strain rate calculated via finite element modeling. The 1D-CNN models were also tested on an on-field dataset of college football impacts and a post-mortem human subject dataset, with similar denoising effects observed. The models can be used to improve detection of head impacts and TBI risk evaluation, and potentially extended to other sensors measuring kinematics.
Abstract:Drug-induced liver injury (DILI) describes the adverse effects of drugs that damage liver. Life-threatening results including liver failure or death were also reported in severe DILI cases. Therefore, DILI-related events are strictly monitored for all approved drugs and the liver toxicity became important assessments for new drug candidates. These DILI-related reports are documented in hospital records, in clinical trial results, and also in research papers that contain preliminary in vitro and in vivo experiments. Conventionally, data extraction from previous publications relies heavily on resource-demanding manual labelling, which considerably decreased the efficiency of the information extraction process. The recent development of artificial intelligence, particularly, the rise of natural language processing (NLP) techniques, enabled the automatic processing of biomedical texts. In this study, based on around 28,000 papers (titles and abstracts) provided by the Critical Assessment of Massive Data Analysis (CAMDA) challenge, we benchmarked model performances on filtering out DILI literature. Among four word vectorization techniques, the model using term frequency-inverse document frequency (TF-IDF) and logistic regression outperformed others with an accuracy of 0.957 with our in-house test set. Furthermore, an ensemble model with similar overall performances was implemented and was fine-tuned to lower the false-negative cases to avoid neglecting potential DILI reports. The ensemble model achieved a high accuracy of 0.954 and an F1 score of 0.955 in the hold-out validation data provided by the CAMDA committee. Moreover, important words in positive/negative predictions were identified via model interpretation. Overall, the ensemble model reached satisfactory classification results, which can be further used by researchers to rapidly filter DILI-related literature.
Abstract:Strain and strain rate are effective traumatic brain injury predictors. Kinematics-based models estimating these metrics suffer from significant different distributions of both kinematics and the injury metrics across head impact types. To address this, previous studies focus on the kinematics but not the injury metrics. We have previously shown the kinematic features vary largely across head impact types, resulting in different patterns of brain deformation. This study analyzes the spatial distribution of brain deformation and applies principal component analysis (PCA) to extract the representative patterns of injury metrics (maximum principal strain (MPS), MPS rate (MPSR) and MPSXMPSR) in four impact types (simulation, football, mixed martial arts and car crashes). We apply PCA to decompose the patterns of the injury metrics for all impacts in each impact type, and investigate the distributions among brain regions using the first principal component (PC1). Furthermore, we developed a deep learning head model (DLHM) to predict PC1 and then inverse-transform to predict for all brain elements. PC1 explained >80% variance on the datasets. Based on PC1 coefficients, the corpus callosum and midbrain exhibit high variance on all datasets. We found MPSXMPSR the most sensitive metric on which the top 5% of severe impacts further deviates from the mean and there is a higher variance among the severe impacts. Finally, the DLHM reached mean absolute errors of <0.018 for MPS, <3.7 (1/s) for MPSR and <1.1 (1/s) for MPSXMPSR, much smaller than the injury thresholds. The brain injury metric in a dataset can be decomposed into mean components and PC1 with high explained variance. The brain dynamics decomposition enables better interpretation of the patterns in brain injury metrics and the sensitivity of brain injury metrics across impact types. The decomposition also reduces the dimensionality of DLHM.
Abstract:Human taste sensation can be qualitatively described with surface electromyography. However, the pattern recognition models trained on one subject (the source domain) do not generalize well on other subjects (the target domain). To improve the generalizability and transferability of taste sensation models developed with sEMG data, two methods were innovatively applied in this study: domain regularized component analysis (DRCA) and conformal prediction with shrunken centroids (CPSC). The effectiveness of these two methods was investigated independently in an unlabeled data augmentation process with the unlabeled data from the target domain, and the same cross-user adaptation pipeline were conducted on six subjects. The results show that DRCA improved the classification accuracy on six subjects (p < 0.05), compared with the baseline models trained only with the source domain data;, while CPSC did not guarantee the accuracy improvement. Furthermore, the combination of DRCA and CPSC presented statistically significant improvement (p < 0.05) in classification accuracy on six subjects. The proposed strategy combining DRCA and CPSC showed its effectiveness in addressing the cross-user data distribution drift in sEMG-based taste sensation recognition application. It also shows the potential in more cross-user adaptation applications.
Abstract:Brain strain and strain rate are effective in predicting traumatic brain injury (TBI) caused by head impacts. However, state-of-the-art finite element modeling (FEM) demands considerable computational time in the computation, limiting its application in real-time TBI risk monitoring. To accelerate, machine learning head models (MLHMs) were developed, and the model accuracy was found to decrease when the training/test datasets were from different head impacts types. However, the size of dataset for specific impact types may not be enough for model training. To address the computational cost of FEM, the limited strain rate prediction, and the generalizability of MLHMs to on-field datasets, we propose data fusion and transfer learning to develop a series of MLHMs to predict the maximum principal strain (MPS) and maximum principal strain rate (MPSR). We trained and tested the MLHMs on 13,623 head impacts from simulations, American football, mixed martial arts, car crash, and compared against the models trained on only simulations or only on-field impacts. The MLHMs developed with transfer learning are significantly more accurate in estimating MPS and MPSR than other models, with a mean absolute error (MAE) smaller than 0.03 in predicting MPS and smaller than 7 (1/s) in predicting MPSR on all impact datasets. The MLHMs can be applied to various head impact types for rapidly and accurately calculating brain strain and strain rate. Besides the clinical applications in real-time brain strain and strain rate monitoring, this model helps researchers estimate the brain strain and strain rate caused by head impacts more efficiently than FEM.
Abstract:Traumatic brain injury can be caused by various types of head impacts. However, due to different kinematic characteristics, many brain injury risk estimation models are not generalizable across the variety of impacts that humans may sustain. The current definitions of head impact subtypes are based on impact sources (e.g., football, traffic accident), which may not reflect the intrinsic kinematic similarities of impacts across the impact sources. To investigate the potential new definitions of impact subtypes based on kinematics, 3,161 head impacts from various sources including simulation, college football, mixed martial arts, and car racing were collected. We applied the K-means clustering to cluster the impacts on 16 standardized temporal features from head rotation kinematics. Then, we developed subtype-specific ridge regression models for cumulative strain damage (using the threshold of 15%), which significantly improved the estimation accuracy compared with the baseline method which mixed impacts from different sources and developed one model (R^2 from 0.7 to 0.9). To investigate the effect of kinematic features, we presented the top three critical features (maximum resultant angular acceleration, maximum angular acceleration along the z-axis, maximum linear acceleration along the y-axis) based on regression accuracy and used logistic regression to find the critical points for each feature that partitioned the subtypes. This study enables researchers to define head impact subtypes in a data-driven manner, which leads to more generalizable brain injury risk estimation.