This paper significantly improves on, and completes the validation of, the approach proposed in "Application of Machine Learning to Construction Injury Prediction" (Tixier et al. 2016 [1]). As in the original study, we use NLP to extract fundamental attributes from raw incident reports and train machine learning models to predict safety outcomes (here, injury severity, injury type, body part impacted, and incident type). However, in this study, the safety outcomes were not extracted via NLP but come from independent human annotations, eliminating any potential source of artificial correlation between predictors and predictands. Results show that the attributes are still highly predictive, confirming the validity of the original study. Other improvements brought by the current study include the use of (1) a much larger dataset, (2) two new models (XGBoost and linear SVM), (3) model stacking, (4) a more straightforward experimental setup with more appropriate performance metrics, and (5) an analysis of per-category attribute importance scores. Finally, the injury severity outcome is now well predicted, which was not the case in the original study and represents a significant advancement.