Lars Egevad

Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden

AI-based Prediction of Biochemical Recurrence from Biopsy and Prostatectomy Samples

Jan 28, 2026

Andrea Camilloni, Chiara Micoli, Nita Mulliqi, Erik Everett Palm, Thorgerdur Palsdottir, Kelvin Szolnoky, Xiaoyi Ji, Sol Erika Boman, Andrea Discacciati, Henrik Grönberg(+4 more)

Abstract: Biochemical recurrence (BCR) after radical prostatectomy (RP) is a surrogate marker for aggressive prostate cancer with adverse outcomes, yet current prognostic tools remain imprecise. We trained an AI-based model on diagnostic prostate biopsy slides from the STHLM3 cohort (n = 676) to predict patient-specific risk of BCR, using foundation models and attention-based multiple instance learning. Generalizability was assessed across three external RP cohorts: LEOPARD (n = 508), CHIMERA (n = 95), and TCGA-PRAD (n = 379). The image-based approach achieved 5-year time-dependent AUCs of 0.64, 0.70, and 0.70, respectively. Integrating clinical variables added complementary prognostic value and enabled statistically significant risk stratification. Compared with guideline-based CAPRA-S, AI incrementally improved postoperative prognostication. These findings suggest biopsy-trained histopathology AI can generalize across specimen types to support preoperative and postoperative decision making, but the added value of AI-based multimodal approaches over simpler predictive models should be critically scrutinized in further studies.

* 39 pages, 6 tables, 11 figures
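
The abstract above names attention-based multiple instance learning without giving details. As a rough illustration only, the sketch below shows one common attention-pooling formulation, in which each tile embedding gets a softmax weight and the slide-level feature is the weighted sum. The function name `attention_pool` and the parameters `V` and `w` are hypothetical stand-ins for learned weights and are not taken from the paper.

```python
import math


def attention_pool(instances, V, w):
    """Pool tile embeddings into one slide-level vector (illustrative sketch).

    instances: list of feature vectors, one per tile.
    V: hidden x dim projection matrix (hypothetical learned parameter).
    w: hidden-length scoring vector (hypothetical learned parameter).
    Returns (pooled_vector, attention_weights).
    """
    scores = []
    for h in instances:
        # Score each tile as w^T tanh(V h).
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * t for wi, t in zip(w, hidden)))

    # Numerically stable softmax over tiles.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the tile embeddings.
    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return pooled, weights


# Toy usage with made-up numbers: three tiles with 2-dimensional embeddings.
tiles = [[0.2, 1.0], [0.9, 0.1], [0.5, 0.5]]
pooled, weights = attention_pool(tiles, V=[[1.0, -1.0]], w=[2.0])
```

In a real pipeline the pooled vector would feed a risk-prediction head, and the tile embeddings would come from a foundation model. Both steps are outside this sketch.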

Artificial Intelligence-Assisted Prostate Cancer Diagnosis for Reduced Use of Immunohistochemistry

Mar 31, 2025

Anders Blilie, Nita Mulliqi, Xiaoyi Ji, Kelvin Szolnoky, Sol Erika Boman, Matteo Titus, Geraldine Martinez Gonzalez, José Asenjo, Marcello Gambacorta, Paolo Libretti(+6 more)

Abstract: Prostate cancer diagnosis heavily relies on histopathological evaluation, which is subject to variability. While immunohistochemical staining (IHC) assists in distinguishing benign from malignant tissue, it involves increased work, higher costs, and diagnostic delays. Artificial intelligence (AI) presents a promising solution to reduce reliance on IHC by accurately classifying atypical glands and borderline morphologies in hematoxylin & eosin (H&E) stained tissue sections. In this study, we evaluated an AI model's ability to minimize IHC use without compromising diagnostic accuracy by retrospectively analyzing prostate core needle biopsies from routine diagnostics at three different pathology sites. These cohorts were composed exclusively of difficult cases where the diagnosing pathologists required IHC to finalize the diagnosis. The AI model demonstrated area under the curve values of 0.951-0.993 for detecting cancer in routine H&E-stained slides. Applying sensitivity-prioritized diagnostic thresholds reduced the need for IHC staining by 44.4%, 42.0%, and 20.7% in the three cohorts investigated, without a single false negative prediction. This AI model shows potential for optimizing IHC use, streamlining decision-making in prostate pathology, and alleviating resource burdens.

* 29 pages, 5 figures and 3 tables

The impact of tissue detection on diagnostic artificial intelligence algorithms in digital pathology

Mar 29, 2025

Sol Erika Boman, Nita Mulliqi, Anders Blilie, Xiaoyi Ji, Kelvin Szolnoky, Einar Gudlaugsson, Emiel A. M. Janssen, Svein R. Kjosavik, José Asenjo, Marcello Gambacorta(+11 more)

Abstract: Tissue detection is a crucial first step in most digital pathology applications. Details of the segmentation algorithm are rarely reported, and there is a lack of studies investigating the downstream effects of a poor segmentation algorithm. Disregarding tissue detection quality could create a bottleneck for downstream performance and jeopardize patient safety if diagnostically relevant parts of the specimen are excluded from analysis in clinical applications. This study aims to determine whether performance of downstream tasks is sensitive to the tissue detection method, and to compare performance of classical and AI-based tissue detection. To this end, we trained an AI model for Gleason grading of prostate cancer in whole slide images (WSIs) using two different tissue detection algorithms: thresholding (classical) and UNet++ (AI). A total of 33,823 WSIs scanned on five digital pathology scanners were used to train the tissue detection AI model. The downstream Gleason grading algorithm was trained and tested using 70,524 WSIs from 13 clinical sites scanned on 13 different scanners. There was a decrease from 116 (0.43%) to 22 (0.08%) fully undetected tissue samples when switching from thresholding-based tissue detection to AI-based, suggesting an AI model may be more reliable than a classical model for avoiding total failures on slides with unusual appearance. On the slides where tissue could be detected by both algorithms, no significant difference in overall Gleason grading performance was observed. However, clinically significant variations in AI grading that depended on the tissue detection method were observed in 3.5% of malignant slides, highlighting the importance of robust tissue detection for optimal clinical performance of diagnostic AI.

* 25 pages, 2 tables, 3 figures, 1 supplementary figure
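
The abstract above contrasts thresholding with an AI model but does not say which thresholding rule was used. The sketch below is a deliberately simple intensity threshold, assuming a bright (near-white) slide background. The function names, the luminance weights, and the cutoff of 220 are illustrative assumptions, not the study's settings.

```python
def tissue_mask(rgb_image, threshold=220):
    """Return a boolean mask that is True where a pixel looks like tissue.

    rgb_image: nested list of rows, each row a list of (r, g, b) tuples in 0-255.
    threshold: hypothetical grayscale cutoff; brighter pixels count as background.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # Standard luminance approximation for converting RGB to grayscale.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            mask_row.append(gray < threshold)
        mask.append(mask_row)
    return mask


def tissue_fraction(mask):
    """Fraction of pixels flagged as tissue; 0.0 means nothing was detected."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


# Toy 2x2 "slide": one stained pixel on a white background.
toy = [[(255, 255, 255), (200, 120, 160)],
       [(255, 255, 255), (255, 255, 255)]]
fraction = tissue_fraction(tissue_mask(toy))
fully_undetected = fraction == 0.0
```

A faintly stained slide can fall entirely above such a cutoff, which is one way a slide could end up with no tissue detected at all.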

Foundation Models -- A Panacea for Artificial Intelligence in Pathology?

Feb 28, 2025

Nita Mulliqi, Anders Blilie, Xiaoyi Ji, Kelvin Szolnoky, Henrik Olsson, Sol Erika Boman, Matteo Titus, Geraldine Martinez Gonzalez, Julia Anna Mielcarz, Masi Valkonen(+21 more)

Abstract: The role of artificial intelligence (AI) in pathology has evolved from aiding diagnostics to uncovering predictive morphological patterns in whole slide images (WSIs). Recently, foundation models (FMs) leveraging self-supervised pre-training have been widely advocated as a universal solution for diverse downstream tasks. However, open questions remain about their clinical applicability and generalization advantages over end-to-end learning using task-specific (TS) models. Here, we focused on AI with clinical-grade performance for prostate cancer diagnosis and Gleason grading. We present the largest validation of AI for this task, using over 100,000 core needle biopsies from 7,342 patients across 15 sites in 11 countries. We compared two FMs with a fully end-to-end TS model in a multiple instance learning framework. Our findings challenge assumptions that FMs universally outperform TS models. While FMs demonstrated utility in data-scarce scenarios, their performance converged with, and was in some cases surpassed by, TS models when sufficient labeled training data were available. Notably, extensive task-specific training markedly reduced clinically significant misgrading, misdiagnosis of challenging morphologies, and variability across different WSI scanners. Additionally, FMs used up to 35 times more energy than the TS model, raising concerns about their sustainability. Our results underscore that while FMs offer clear advantages for rapid prototyping and research, their role as a universal solution for clinically applicable medical AI remains uncertain. For high-stakes clinical applications, rigorous validation and consideration of task-specific training remain critically important. We advocate for integrating the strengths of FMs and end-to-end learning to achieve robust and resource-efficient AI pathology solutions fit for clinical use.

* 50 pages, 15 figures and an appendix (study protocol) which is previously published, see https://doi.org/10.1101/2024.07.04.24309948

Physical Color Calibration of Digital Pathology Scanners for Robust Artificial Intelligence Assisted Cancer Diagnosis

Jul 07, 2023

Xiaoyi Ji, Richard Salmon, Nita Mulliqi, Umair Khan, Yinxi Wang, Anders Blilie, Henrik Olsson, Bodil Ginnerup Pedersen, Karina Dalsgaard Sørensen, Benedicte Parm Ulhøi(+7 more)

Abstract: The potential of artificial intelligence (AI) in digital pathology is limited by technical inconsistencies in the production of whole slide images (WSIs), leading to degraded AI performance and posing a challenge for widespread clinical application as fine-tuning algorithms for each new site is impractical. Changes in the imaging workflow can also lead to compromised diagnoses and patient safety risks. We evaluated whether physical color calibration of scanners can standardize WSI appearance and enable robust AI performance. We employed a color calibration slide in four different laboratories and evaluated its impact on the performance of an AI system for prostate cancer diagnosis on 1,161 WSIs. Color standardization resulted in consistently improved AI model calibration and significant improvements in Gleason grading performance. The study demonstrates that physical color calibration provides a potential solution to the variation introduced by different scanners, making AI-based cancer diagnostics more reliable and applicable in clinical settings.

Using deep learning to detect patients at risk for prostate cancer despite benign biopsies

Jul 31, 2021

Bojing Liu, Yinxi Wang, Philippe Weitz, Johan Lindberg, Johan Hartman, Lars Egevad, Henrik Grönberg, Martin Eklund, Mattias Rantalainen

Abstract: Background: Transrectal ultrasound-guided systematic biopsy of the prostate is a routine procedure to establish a prostate cancer diagnosis. However, the 10-12 prostate core biopsies only sample a relatively small volume of the prostate, and tumour lesions in regions between biopsy cores can be missed, leading to a well-known low sensitivity to detect clinically relevant cancer. As a proof-of-principle, we developed and validated a deep convolutional neural network model to distinguish between morphological patterns in benign prostate biopsy whole slide images from men with and without established cancer. Methods: This study included 14,354 hematoxylin and eosin stained whole slide images from benign prostate biopsies from 1,508 men in two groups: men without an established prostate cancer (PCa) diagnosis and men with at least one core biopsy diagnosed with PCa. 80% of the participants were assigned as training data and used for model optimization (1,211 men), and the remaining 20% (297 men) as a held-out test set used to evaluate model performance. An ensemble of 10 deep convolutional neural network models was optimized for classification of biopsies from men with and without established cancer. Hyperparameter optimization and model selection were performed by cross-validation in the training data. Results: Area under the receiver operating characteristic curve (ROC-AUC) was estimated as 0.727 (bootstrap 95% CI: 0.708-0.745) on biopsy level and 0.738 (bootstrap 95% CI: 0.682-0.796) on man level. At a specificity of 0.9 the model had an estimated sensitivity of 0.348. Conclusion: The developed model has the ability to detect men with risk of missed PCa due to under-sampling of the prostate. The proposed model has the potential to reduce the number of false negative cases in routine systematic prostate biopsies and to indicate men who could benefit from MRI-guided re-biopsy.

* 13 pages, 3 figures
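
The abstract above reports sensitivity at a fixed specificity of 0.9. As a minimal sketch of how such an operating point can be read off model scores, the function below picks the score cutoff that leaves the target fraction of negatives at or below it, then measures the share of positives above it. The function name and the toy scores are hypothetical, and the authors' exact procedure may differ.

```python
import math


def sensitivity_at_specificity(scores_neg, scores_pos, target_specificity=0.9):
    """Return (threshold, specificity, sensitivity) for a fixed specificity target.

    scores_neg / scores_pos: model scores for true negatives and true positives.
    A case is called positive when its score is strictly above the threshold.
    """
    ordered = sorted(scores_neg)
    # Smallest number of negatives that must sit at or below the cutoff;
    # the tiny epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(target_specificity * len(ordered) - 1e-9))
    threshold = ordered[k - 1]
    specificity = sum(s <= threshold for s in scores_neg) / len(scores_neg)
    sensitivity = sum(s > threshold for s in scores_pos) / len(scores_pos)
    return threshold, specificity, sensitivity


# Toy scores (made up): ten men without cancer, four with cancer.
neg_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pos_scores = [5, 9.5, 12, 20]
thr, spec, sens = sensitivity_at_specificity(neg_scores, pos_scores)
```

Tied scores can push the achieved specificity above the target, so the function returns the realized value rather than assuming 0.9.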

Transcriptome-wide prediction of prostate cancer gene expression from histopathology images using co-expression based convolutional neural networks

Apr 19, 2021

Philippe Weitz, Yinxi Wang, Kimmo Kartasalo, Lars Egevad, Johan Lindberg, Henrik Grönberg, Martin Eklund, Mattias Rantalainen

Abstract: Molecular phenotyping by gene expression profiling is common in contemporary cancer research and in molecular diagnostics. However, molecular profiling remains costly and resource-intensive to implement, and is just starting to be introduced into clinical diagnostics. Molecular changes, including genetic alterations and gene expression changes, occurring in tumors cause morphological changes in tissue, which can be observed on the microscopic level. The relationship between morphological patterns and some of the molecular phenotypes can be exploited to predict molecular phenotypes directly from routine haematoxylin and eosin (H&E) stained whole slide images (WSIs) using deep convolutional neural networks (CNNs). In this study, we propose a new, computationally efficient approach for disease specific modelling of relationships between morphology and gene expression, and we conducted the first transcriptome-wide analysis in prostate cancer, using CNNs to predict bulk RNA-sequencing estimates from WSIs of H&E stained tissue. The work is based on the TCGA PRAD study and includes both WSIs and RNA-seq data for 370 patients. Out of 15586 protein coding and sufficiently frequently expressed transcripts, 6618 had predicted expression significantly associated with RNA-seq estimates (FDR-adjusted p-value < 1 × 10⁻⁴) in a cross-validation. 5419 (81.9%) of these were subsequently validated in a held-out test set. We also demonstrate the ability to predict a prostate cancer specific cell cycle progression score directly from WSIs. These findings suggest that contemporary computer vision models offer an inexpensive and scalable solution for prediction of gene expression phenotypes directly from WSIs, providing opportunity for cost-effective large-scale research studies and molecular diagnostics.
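
The abstract above filters transcripts by FDR-adjusted p-value but does not name the adjustment procedure. The sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; that choice, the function name `bh_adjust`, and the toy p-values are assumptions made for illustration.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (made up), then a selection at an illustrative cutoff.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adj = bh_adjust(toy_p)
selected = [i for i, q in enumerate(toy_adj) if q < 0.03]

# Share of cross-validation hits confirmed in the held-out test set,
# using the counts quoted in the abstract.
validated_share = 5419 / 6618
```

The last line recomputes the validated share from the abstract's own counts, which rounds to the reported 81.9%.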

Detection of Perineural Invasion in Prostate Needle Biopsies with Deep Neural Networks

Apr 03, 2020

Peter Ström, Kimmo Kartasalo, Pekka Ruusuvuori, Henrik Grönberg, Hemamali Samaratunga, Brett Delahunt, Toyonori Tsuzuki, Lars Egevad, Martin Eklund

Abstract: Background: The detection of perineural invasion (PNI) by carcinoma in prostate biopsies has been shown to be associated with poor prognosis. The assessment and quantification of PNI is, however, labor intensive. In this study, we aimed to develop an algorithm based on deep neural networks to aid pathologists in this task. Methods: We collected, digitized and pixel-wise annotated the PNI findings in each of the approximately 80,000 biopsy cores from the 7,406 men who underwent biopsy in the prospective and diagnostic STHLM3 trial between 2012 and 2014. In total, 485 biopsy cores showed PNI. We also digitized more than 10% (n=8,318) of the PNI negative biopsy cores. Digitized biopsies from a random selection of 80% of the men were used to build deep neural networks, and the remaining 20% were used to evaluate the performance of the algorithm. Results: For the detection of PNI in prostate biopsy cores the network had an estimated area under the receiver operating characteristics curve of 0.98 (95% CI 0.97-0.99) based on 106 PNI positive cores and 1,652 PNI negative cores in the independent test set. For the pre-specified operating point this translates to sensitivity of 0.87 and specificity of 0.97. The corresponding positive and negative predictive values were 0.67 and 0.99, respectively. For localizing the regions of PNI within a slide we estimated an average intersection over union of 0.50 (CI: 0.46-0.55). Conclusion: We have developed an algorithm based on deep neural networks for detecting PNI in prostate biopsies with apparently acceptable diagnostic properties. These algorithms have the potential to aid pathologists in the day-to-day work by drastically reducing the number of biopsy cores that need to be assessed for PNI and by highlighting regions of diagnostic interest.

* 20 pages, 5 figures
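
The predictive values in the abstract above follow from sensitivity, specificity, and the class counts in the test set. The sketch below redoes that arithmetic from the rounded figures quoted in the abstract; the function name is hypothetical. Because the inputs are rounded to two decimals, the result is expected to approximate the reported values rather than match them exactly.

```python
def predictive_values(sensitivity, specificity, n_pos, n_neg):
    """Return (PPV, NPV) from sensitivity, specificity and class counts."""
    tp = sensitivity * n_pos   # expected true positives
    fn = n_pos - tp            # expected false negatives
    tn = specificity * n_neg   # expected true negatives
    fp = n_neg - tn            # expected false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Rounded figures from the abstract: 106 PNI-positive and 1,652 PNI-negative cores.
ppv, npv = predictive_values(0.87, 0.97, n_pos=106, n_neg=1652)
```

With these inputs the NPV lands close to the reported 0.99, while the PPV comes out slightly below the reported 0.67. That gap is plausibly explained by rounding of the specificity, because with so few positive cores the PPV is sensitive to small changes in the false-positive count.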

Pathologist-Level Grading of Prostate Biopsies with Artificial Intelligence

Jul 02, 2019

Peter Ström, Kimmo Kartasalo, Henrik Olsson, Leslie Solorzano, Brett Delahunt, Daniel M. Berney, David G. Bostwick, Andrew J. Evans, David J. Grignon, Peter A. Humphrey(+22 more)

Abstract: Background: An increasing volume of prostate biopsies and a world-wide shortage of uro-pathologists put a strain on pathology departments. Additionally, the high intra- and inter-observer variability in grading can result in over- and undertreatment of prostate cancer. Artificial intelligence (AI) methods may alleviate these problems by assisting pathologists to reduce workload and harmonize grading. Methods: We digitized 6,682 needle biopsies from 976 participants in the population-based STHLM3 diagnostic study to train deep neural networks for assessing prostate biopsies. The networks were evaluated by predicting the presence, extent, and Gleason grade of malignant tissue for an independent test set comprising 1,631 biopsies from 245 men. We additionally evaluated grading performance on 87 biopsies individually graded by 23 experienced urological pathologists from the International Society of Urological Pathology. We assessed discriminatory performance by receiver operating characteristics (ROC) and tumor extent predictions by correlating predicted millimeter cancer length against measurements by the reporting pathologist. We quantified the concordance between grades assigned by the AI and the expert urological pathologists using Cohen's kappa. Results: The performance of the AI to detect and grade cancer in prostate needle biopsy samples was comparable to that of international experts in prostate pathology. The AI achieved an area under the ROC curve of 0.997 for distinguishing between benign and malignant biopsy cores, and 0.999 for distinguishing between men with or without prostate cancer. The correlation between millimeter cancer predicted by the AI and assigned by the reporting pathologist was 0.96. For assigning Gleason grades, the AI achieved an average pairwise kappa of 0.62. This was within the range of the corresponding values for the expert pathologists (0.60 to 0.73).

* 45 pages, 11 figures
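
The abstract above summarizes grading agreement as an average pairwise Cohen's kappa. The sketch below shows that computation for one rater against a panel. It uses unweighted kappa, which is an assumption because the abstract does not state a weighting scheme, and the function names and toy grades are made up for illustration.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two raters' labels for the same cases."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(target, others):
    """Average kappa between one rater (e.g. the AI) and each other rater."""
    return sum(cohen_kappa(target, o) for o in others) / len(others)


# Toy grade groups for four biopsies (made-up values).
ai_grades = [1, 1, 2, 2]
panel = [[1, 1, 2, 2], [1, 2, 1, 2]]
average_kappa = mean_pairwise_kappa(ai_grades, panel)
```

Applying the same averaging to each pathologist against the rest of the panel would give a per-rater range, comparable in form to the 0.60 to 0.73 range quoted in the abstract.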
