Abstract:This work investigates how hierarchically structured data can help neural networks learn conceptual representations of cathedrals. The underlying WikiScenes dataset provides a spatially organized hierarchical structure of cathedral components. We propose a novel hierarchical contrastive training approach that leverages a triplet margin loss to represent the data's spatial hierarchy in the encoder's latent space. As such, the proposed approach investigates if the dataset structure provides valuable information for self-supervised learning. We apply t-SNE to visualize the resultant latent space and evaluate the proposed approach by comparing it with other dataset-specific contrastive learning methods using a common downstream classification task. The proposed method outperforms the comparable weakly-supervised and baseline methods. Our findings suggest that dataset structure is a valuable modality for weakly-supervised learning.
Abstract:Computational pathology uses artificial intelligence to enable precision medicine and decision support systems through the analysis of whole slide images. It has the potential to revolutionize the diagnosis and treatment of cancer. However, a major challenge to this objective is that for many specific computational pathology tasks the amount of data is inadequate for development. To address this challenge, we created Virchow, a 632 million parameter deep neural network foundation model for computational pathology. Using self-supervised learning, Virchow is trained on 1.5 million hematoxylin and eosin stained whole slide images from diverse tissue groups, which is orders of magnitude more data than previous works. When evaluated on downstream tasks including tile-level pan-cancer detection and subtyping and slide-level biomarker prediction, Virchow outperforms state-of-the-art systems both on internal datasets drawn from the same population as the pretraining data as well as external public datasets. Virchow achieves 93% balanced accuracy for pancancer tile classification, and AUCs of 0.983 for colon microsatellite instability status prediction and 0.967 for breast CDH1 status prediction. The gains in performance highlight the importance of pretraining on massive pathology image datasets, suggesting pretraining on even larger datasets could continue improving performance for many high-impact applications where limited amounts of training data are available, such as drug outcome prediction.