Abstract: Efficient and precise quantification of lymphocytes in histopathology slides is essential for characterizing the tumor microenvironment and for gaining insight into immunotherapy response. We developed a data-centric optimization pipeline that attains strong lymphocyte detection performance using an off-the-shelf YOLOv5 model, without any architectural modifications. Our contribution relies on targeted dataset augmentation strategies, including novel biological upsampling and custom visual cohesion transformations tailored to the unique properties of tissue imagery, and substantially improves model performance. Our optimization shows that, given intensive customization, standard computational pathology models can support high-quality biomarker development without any increase in architectural complexity. We demonstrate the value of this approach in breast cancer, where our strategies lead to good lymphocyte detection performance, which suggests that the approach may apply more broadly. Furthermore, our data curation techniques support benchmarking of histological analysis and indicate improved potential for generalization.
Abstract: Deep Learning (DL) can predict biomarkers from cancer histopathology. Several clinically approved applications use this technology. Most approaches, however, predict categorical labels, whereas biomarkers are often continuous measurements. We hypothesized that regression-based DL outperforms classification-based DL. Therefore, we developed and evaluated a new self-supervised attention-based weakly supervised regression method that predicts continuous biomarkers directly from images in 11,671 patients across nine cancer types. We tested our method for multiple clinically and biologically relevant biomarkers: homologous recombination deficiency (HRD) score, a clinically used pan-cancer biomarker, as well as markers of key biological processes in the tumor microenvironment. Using regression significantly enhances the accuracy of biomarker prediction, while also improving the interpretability of the results over classification. In a large cohort of colorectal cancer patients, regression-based prediction scores provide a higher prognostic value than classification-based scores. Our open-source regression approach offers a promising alternative for continuous biomarker analysis in computational pathology.
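The contrast between regression and classification targets in attention-based weakly supervised learning can be illustrated with a small toy example. The sketch below is not the authors' implementation: the function names (`attention_pool`, `predict_score`, `binarize`, `squared_error`), the linear attention scorer, the feature values, and the cutoff of 42.0 are all hypothetical choices made for illustration. It shows a slide represented as a bag of tile feature vectors, pooled with softmax attention into one slide-level vector that a linear head maps to a continuous score, and it shows how dichotomizing a continuous target discards the distance between patients.

```python
import math


def attention_pool(tiles, attn_w):
    """Pool tile feature vectors into one slide vector with softmax attention.

    tiles:  list of equal-length feature vectors, one per image tile
    attn_w: hypothetical linear attention weights, one per feature
    """
    # One raw attention score per tile (simple linear scorer, an assumption).
    scores = [sum(w * x for w, x in zip(attn_w, tile)) for tile in tiles]
    # Numerically stable softmax over the tiles.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted mean of the tile features.
    dim = len(tiles[0])
    pooled = [sum(a * tile[d] for a, tile in zip(alphas, tiles)) for d in range(dim)]
    return pooled, alphas


def predict_score(tiles, attn_w, head_w, head_b):
    """Regression head: map the pooled slide vector to a continuous score."""
    pooled, alphas = attention_pool(tiles, attn_w)
    score = sum(w * x for w, x in zip(head_w, pooled)) + head_b
    return score, alphas


def squared_error(pred, target):
    """Regression loss that keeps the magnitude of the error."""
    return (pred - target) ** 2


def binarize(target, threshold):
    """Classification target: dichotomize a continuous biomarker at a cutoff."""
    return 1 if target >= threshold else 0


if __name__ == "__main__":
    tiles = [[1.0, 0.0], [0.0, 1.0]]  # two toy tiles, two features each
    score, alphas = predict_score(tiles, [0.0, 0.0], [2.0, 4.0], 1.0)
    print("slide-level score:", score, "attention:", alphas)

    cutoff = 42.0  # hypothetical cutoff, for illustration only
    for value in (41.0, 43.0, 90.0):
        print(value, "-> class", binarize(value, cutoff))
```

In this toy setting, the values 41.0 and 43.0 fall into different classes although they are nearly identical, while 43.0 and 90.0 share a class although they are far apart. A squared-error objective on the continuous value preserves that distinction, which is one intuition for why a regression target might carry more signal than a dichotomized one.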