Complejo Hospitalario
Abstract:This paper presents a comprehensive study on the evaluation of explanatory capabilities of machine learning models, with a focus on Decision Trees, Random Forest and XGBoost models using a pancreatic cancer dataset. We use Human-in-the-Loop related techniques and medical guidelines as a source of domain knowledge to establish the importance of the different features that are relevant to establish a pancreatic cancer treatment. These features are not only used as a dimensionality reduction approach for the machine learning models, but also as way to evaluate the explainability capabilities of the different models using agnostic and non-agnostic explainability techniques. To facilitate interpretation of explanatory results, we propose the use of similarity measures such as the Weighted Jaccard Similarity coefficient. The goal is to not only select the best performing model but also the one that can best explain its conclusions and aligns with human domain knowledge.