Abstract:Despite the fact that cancer survivability rates vary greatly between stages, traditional survival prediction models have frequently been trained and assessed using examples from all combined phases of the disease. This method may result in an overestimation of performance and ignore the stage-specific variations. Using the SEER dataset, we created and verified explainable machine learning (ML) models to predict stage-specific cancer survivability in colorectal, stomach, and liver cancers. ML-based cancer survival analysis has been a long-standing topic in the literature; however, studies involving the explainability and transparency of ML survivability models are limited. Our use of explainability techniques, including SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), enabled us to illustrate significant feature-cancer stage interactions that would have remained hidden in traditional black-box models. We identified how certain demographic and clinical variables influenced survival differently across cancer stages and types. These insights provide not only transparency but also clinical relevance, supporting personalized treatment planning. By focusing on stage-specific models, this study provides new insights into the most important factors at each stage of cancer, offering transparency and potential clinical relevance to support personalized treatment planning.




Abstract:Availability of domain-specific datasets is an essential problem in object detection. Maritime vessel detection of inshore and offshore datasets is no exception, there is a limited number of studies addressing this need. For that reason, we collected a dataset of images of maritime vessels taking into account different factors: background variation, atmospheric conditions, illumination, visible proportion, occlusion and scale variation. Vessel instances (including 9 types of vessels), seamarks and miscellaneous floaters were precisely annotated: we employed a first round of labelling and subsequently, we used the CSRT [1] tracker to trace inconsistencies and relabel inadequate label instances. Moreover, we evaluated the the out-of-the-box performance of four prevalent object detection algorithms (Faster R-CNN [2], R-FCN [3], SSD [4] and EfficientDet [5]). The algorithms were previously trained on the Microsoft COCO dataset. We compare their accuracy based on feature extractor and object size. Our experiments show that Faster R-CNN with Inception-Resnet v2 outperforms the other algorithms, except in the large object category where EfficientDet surpasses the latter.