Title: MI-Gen: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images

Nov 27, 2023

Pingyi Chen, Honglin Li, Chenglu Zhu, Sunyi Zheng, Lin Yang

Abstract: Whole-slide images (WSIs) are the foundation of digital pathology for the diagnosis and treatment of carcinomas. Writing pathology reports is laborious and error-prone for inexperienced pathologists. To reduce the workload and improve clinical automation, we investigate how to generate pathology reports from whole-slide images. On the data side, we curated the largest WSI-text dataset (TCGA-PathoText). Specifically, we collected nearly 10,000 high-quality WSI-text pairs for visual-language models by recognizing and cleaning the pathology reports that narrate diagnostic slides in TCGA. On the model side, we propose the multiple instance generative model (MI-Gen), which can produce pathology reports for gigapixel WSIs. We benchmark our model on the largest subset of TCGA-PathoText. Experimental results show that our model can generate pathology reports containing multiple clinical clues. Furthermore, WSI-text prediction can be seen as a form of visual-language pre-training, which enables our model to be transferred to downstream diagnostic tasks such as carcinoma grading and phenotyping. We observe that simple semantic extraction from the pathology reports achieves the best performance (an F1 score of 0.838) on BRCA subtyping without adding extra parameters or tricky fine-tuning. Our collected dataset and related code will all be publicly available.
