Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

May 02, 2024

Tiago Mota, M. Rita Verdelho, Alceu Bissoto, Carlos Santiago, Catarina Barata

Figure 1 for MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

Figure 2 for MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

Figure 3 for MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

Figure 4 for MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

Share this with someone who'll enjoy it:

Abstract:The acquisition of different data modalities can enhance our knowledge and understanding of various diseases, paving the way for a more personalized healthcare. Thus, medicine is progressively moving towards the generation of massive amounts of multi-modal data (\emph{e.g,} molecular, radiology, and histopathology). While this may seem like an ideal environment to capitalize data-centric machine learning approaches, most methods still focus on exploring a single or a pair of modalities due to a variety of reasons: i) lack of ready to use curated datasets; ii) difficulty in identifying the best multi-modal fusion strategy; and iii) missing modalities across patients. In this paper we introduce a real world multi-modal dataset called MMIST-CCRCC that comprises 2 radiology modalities (CT and MRI), histopathology, genomics, and clinical data from 618 patients with clear cell renal cell carcinoma (ccRCC). We provide single and multi-modal (early and late fusion) benchmarks in the task of 12-month survival prediction in the challenging scenario of one or more missing modalities for each patient, with missing rates that range from 26$\%$ for genomics data to more than 90$\%$ for MRI. We show that even with such severe missing rates the fusion of modalities leads to improvements in the survival forecasting. Additionally, incorporating a strategy to generate the latent representations of the missing modalities given the available ones further improves the performance, highlighting a potential complementarity across modalities. Our dataset and code are available here: https://multi-modal-ist.github.io/datasets/ccRCC

* Accepted in DCA in MI Workshop@CVPR2024

View paper on

Share this with someone who'll enjoy it:

Title:MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems

Paper and Code