Abstract:This study evaluates the generalisation capabilities of state-of-the-art histopathology foundation models on out-of-distribution multi-stain autoimmune Immunohistochemistry datasets. We compare 13 feature extractor models, including ImageNet-pretrained networks, and histopathology foundation models trained on both public and proprietary data, on Rheumatoid Arthritis subtyping and Sjogren's Disease detection tasks. Using a simple Attention-Based Multiple Instance Learning classifier, we assess the transferability of learned representations from cancer H&E images to autoimmune IHC images. Contrary to expectations, histopathology-pretrained models did not significantly outperform ImageNet-pretrained models. Furthermore, there was evidence of both autoimmune feature misinterpretation and biased feature importance. Our findings highlight the challenges in transferring knowledge from cancer to autoimmune histopathology and emphasise the need for careful evaluation of AI models across diverse histopathological tasks. The code to run this benchmark is available at https://github.com/AmayaGS/ImmunoHistoBench.
Abstract:Rheumatoid Arthritis (RA) is a chronic, autoimmune disease which primarily affects the joint's synovial tissue. It is a highly heterogeneous disease, with wide cellular and molecular variability observed in synovial tissues. Over the last two decades, the methods available for their study have advanced considerably. In particular, Immunohistochemistry stains are well suited to highlighting the functional organisation of samples. Yet, analysis of IHC-stained synovial tissue samples is still overwhelmingly done manually and semi-quantitatively by expert pathologists. This is because in addition to the fragmented nature of IHC stained synovial tissue, there exist wide variations in intensity and colour, strong clinical centre batch effect, as well as the presence of many undesirable artefacts present in gigapixel Whole Slide Images (WSIs), such as water droplets, pen annotation, folded tissue, blurriness, etc. There is therefore a strong need for a robust, repeatable automated tissue segmentation algorithm which can cope with this variability and provide support to imaging pipelines. We train a UNET on a hand-curated, heterogeneous real-world multi-centre clinical dataset R4RA, which contains multiple types of IHC staining. The model obtains a DICE score of 0.865 and successfully segments different types of IHC staining, as well as dealing with variance in colours, intensity and common WSIs artefacts from the different clinical centres. It can be used as the first step in an automated image analysis pipeline for synovial tissue samples stained with IHC, increasing speed, reproducibility and robustness.