Abstract:Deep learning classifiers for characterization of whole slide tissue morphology require large volumes of annotated data to learn variations across different tissue and cancer types. As is well known, manual generation of digital pathology training data is time consuming and expensive. In this paper, we propose a semi-automated method for annotating a group of similar instances at once, instead of collecting only per-instance manual annotations. This allows for a much larger training set, that reflects visual variability across multiple cancer types and thus training of a single network which can be automatically applied to each cancer type without human adjustment. We apply our method to the important task of classifying Tumor Infiltrating Lymphocytes (TILs) in H&E images. Prior approaches were trained for individual cancer types, with smaller training sets and human-in-the-loop threshold adjustment. We utilize these thresholded results as large scale "semi-automatic" annotations. Combined with existing manual annotations, our trained deep networks are able to automatically produce better TIL prediction results in 12 cancer types, compared to the human-in-the-loop approach.
Abstract:Quantitative assessment of Tumor-TIL spatial relationships is increasingly important in both basic science and clinical aspects of breast cancer research. We have developed and evaluated convolutional neural network (CNN) analysis pipelines to generate combined maps of cancer regions and tumor infiltrating lymphocytes (TILs) in routine diagnostic breast cancer whole slide tissue images (WSIs). We produce interactive whole slide maps that provide 1) insight about the structural patterns and spatial distribution of lymphocytic infiltrates and 2) facilitate improved quantification of TILs. We evaluated both tumor and TIL analyses using three CNN networks - Resnet-34, VGG16 and Inception v4, and demonstrated that the results compared favorably to those obtained by what believe are the best published methods. We have produced open-source tools and generated a public dataset consisting of tumor/TIL maps for 1,015 TCGA breast cancer images. We also present a customized web-based interface that enables easy visualization and interactive exploration of high-resolution combined Tumor-TIL maps for 1,015TCGA invasive breast cancer cases that can be downloaded for further downstream analyses.