Abstract:Advances in digital pathology and artificial intelligence (AI) offer promising opportunities for clinical decision support and enhancing diagnostic workflows. Previous studies already demonstrated AI's potential for automated Gleason grading, but lack state-of-the-art methodology and model reusability. To address this issue, we propose DeepGleason: an open-source deep neural network based image classification system for automated Gleason grading using whole-slide histopathology images from prostate tissue sections. Implemented with the standardized AUCMEDI framework, our tool employs a tile-wise classification approach utilizing fine-tuned image preprocessing techniques in combination with a ConvNeXt architecture which was compared to various state-of-the-art architectures. The neural network model was trained and validated on an in-house dataset of 34,264 annotated tiles from 369 prostate carcinoma slides. We demonstrated that DeepGleason is capable of highly accurate and reliable Gleason grading with a macro-averaged F1-score of 0.806, AUC of 0.991, and Accuracy of 0.974. The internal architecture comparison revealed that the ConvNeXt model was superior performance-wise on our dataset to established and other modern architectures like transformers. Furthermore, we were able to outperform the current state-of-the-art in tile-wise fine-classification with a sensitivity and specificity of 0.94 and 0.98 for benign vs malignant detection as well as of 0.91 and 0.75 for Gleason 3 vs Gleason 4 & 5 classification, respectively. Our tool contributes to the wider adoption of AI-based Gleason grading within the research community and paves the way for broader clinical application of deep learning models in digital pathology. DeepGleason is open-source and publicly available for research application in the following Git repository: https://github.com/frankkramer-lab/DeepGleason.
Abstract:Prostate cancer is a dominant health concern calling for advanced diagnostic tools. Utilizing digital pathology and artificial intelligence, this study explores the potential of 11 deep neural network architectures for automated Gleason grading in prostate carcinoma focusing on comparing traditional and recent architectures. A standardized image classification pipeline, based on the AUCMEDI framework, facilitated robust evaluation using an in-house dataset consisting of 34,264 annotated tissue tiles. The results indicated varying sensitivity across architectures, with ConvNeXt demonstrating the strongest performance. Notably, newer architectures achieved superior performance, even though with challenges in differentiating closely related Gleason grades. The ConvNeXt model was capable of learning a balance between complexity and generalizability. Overall, this study lays the groundwork for enhanced Gleason grading systems, potentially improving diagnostic efficiency for prostate cancer.
Abstract:Performance measures are an important tool for assessing and comparing different medical image segmentation algorithms. Unfortunately, the current measures have their weaknesses when it comes to assessing certain edge cases. These limitations arouse when images with a very small region of interest or without a region of interest at all are assessed. As a solution for these limitations, we propose a new medical image segmentation metric: MISm. To evaluate MISm, the popular metrics in the medical image segmentation and MISm were compared using images of magnet resonance tomography from several scenarios. In order to allow application in the community and reproducibility of experimental results, we included MISm in the publicly available evaluation framework MISeval: https://github.com/frankkramer-lab/miseval/tree/master/miseval
Abstract:Correct performance assessment is crucial for evaluating modern artificial intelligence algorithms in medicine like deep-learning based medical image segmentation models. However, there is no universal metric library in Python for standardized and reproducible evaluation. Thus, we propose our open-source publicly available Python package MISeval: a metric library for Medical Image Segmentation Evaluation. The implemented metrics can be intuitively used and easily integrated into any performance assessment pipeline. The package utilizes modern CI/CD strategies to ensure functionality and stability. MISeval is available from PyPI (miseval) and GitHub: https://github.com/frankkramer-lab/miseval.