Abstract: The availability of multi-modality datasets provides a unique opportunity to characterize the same object of interest from multiple viewpoints more comprehensively. In this work, we investigate the use of canonical correlation analysis (CCA) and its penalized variants (pCCA) for the fusion of two modalities. We study a simple graphical model for the generation of two-modality data. We show analytically that, when the model parameters are known, posterior mean estimators that jointly use both modalities outperform arbitrary linear mixing of single-modality posterior estimators in latent variable prediction. Unlike traditional CCA, which is inapplicable in the high-dimensional, low-sample regime, pCCA variants that incorporate domain knowledge can still discover correlations there. To facilitate the generation of multi-dimensional embeddings with pCCA, we propose two matrix deflation schemes that enforce desirable properties exhibited by CCA. Combining these elements, we propose a two-stage prediction pipeline that uses deflation-generated pCCA embeddings for latent variable prediction. On simulated data, our proposed model drastically reduces the mean-squared error of latent variable prediction. When applied to publicly available histopathology and RNA-sequencing data from breast cancer patients in The Cancer Genome Atlas (TCGA), our model outperforms principal component analysis (PCA) embeddings of the same dimension in survival prediction.
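A minimal sketch of the two-stage idea follows, using standard (unpenalized) CCA from scikit-learn; the penalized variants and deflation schemes of the abstract are not library primitives, so this only illustrates the overall pipeline shape on synthetic two-modality data with a shared latent variable z (all names and dimensions here are illustrative).

    # Stage 1: correlated embeddings via CCA; Stage 2: latent variable regression.
    import numpy as np
    from sklearn.cross_decomposition import CCA
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    n, p1, p2, k = 200, 30, 40, 5           # samples, modality dims, embedding dim

    # Hypothetical two-modality data generated from a shared latent signal z.
    z = rng.normal(size=(n, k))
    X = z @ rng.normal(size=(k, p1)) + 0.1 * rng.normal(size=(n, p1))
    Y = z @ rng.normal(size=(k, p2)) + 0.1 * rng.normal(size=(n, p2))

    # Stage 1: learn k-dimensional correlated embeddings of the two modalities.
    cca = CCA(n_components=k).fit(X, Y)
    Xc, Yc = cca.transform(X, Y)

    # Stage 2: predict the latent variable from the concatenated embeddings.
    model = Ridge().fit(np.hstack([Xc, Yc]), z)
    print("train R^2:", model.score(np.hstack([Xc, Yc]), z))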
Abstract: Effective understanding of a disease such as cancer requires fusing multiple sources of information captured across physical scales by multimodal data. In this work, we propose a novel feature embedding module, derived from canonical correlation analysis, that accounts for intra-modality and inter-modality correlations. Experiments on simulated and real data demonstrate that the proposed module learns well-correlated multi-dimensional embeddings. These embeddings perform competitively on one-year survival classification of TCGA-BRCA breast cancer patients, yielding average F1 scores of up to 58.69% under 5-fold cross-validation.
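A minimal sketch of the evaluation protocol, assuming pre-extracted histology features X, gene expression features Y, and binary one-year survival labels; the paper's embedding module is approximated here by plain CCA followed by logistic regression, and the function name is hypothetical.

    import numpy as np
    from sklearn.cross_decomposition import CCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold
    from sklearn.metrics import f1_score

    def cv_f1(X, Y, labels, k=8, folds=5):
        # 5-fold cross-validated F1: embed both modalities, then classify.
        scores = []
        for tr, te in StratifiedKFold(folds, shuffle=True, random_state=0).split(X, labels):
            cca = CCA(n_components=k).fit(X[tr], Y[tr])
            Ztr = np.hstack(cca.transform(X[tr], Y[tr]))   # fused train embedding
            Zte = np.hstack(cca.transform(X[te], Y[te]))   # fused test embedding
            clf = LogisticRegression(max_iter=1000).fit(Ztr, labels[tr])
            scores.append(f1_score(labels[te], clf.predict(Zte)))
        return np.mean(scores)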
Abstract: Lung cancer has a high rate of recurrence in early-stage patients. Predicting post-surgical recurrence in lung cancer patients has traditionally relied on single-modality information, either genomics or radiology images. We investigate the potential of multimodal fusion for this task. By combining computed tomography (CT) images and genomics, we demonstrate improved prediction of recurrence using linear Cox proportional hazards models with elastic net regularization. On a recent non-small cell lung cancer (NSCLC) radiogenomics dataset of 130 patients, we observe an increase in concordance-index values of up to 10%. Non-linear methods from the neural network literature, such as multi-layer perceptrons and visual question answering fusion modules, did not improve performance consistently, indicating the need for larger multimodal datasets and fusion techniques better adapted to this biological setting.
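A minimal sketch of the linear fusion baseline, assuming per-patient CT radiomic features and gene expression features have already been extracted into numpy arrays; it uses the elastic-net-penalized Cox model from the lifelines library, and the function name and penalty values are illustrative.

    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    def fit_fused_cox(ct_feats, gene_feats, time, event, penalizer=0.1, l1_ratio=0.5):
        # Early fusion: concatenate the two modalities' feature vectors.
        fused = np.hstack([ct_feats, gene_feats])
        df = pd.DataFrame(fused, columns=[f"f{i}" for i in range(fused.shape[1])])
        df["time"], df["event"] = time, event
        # Elastic net Cox regression: l1_ratio mixes lasso and ridge penalties.
        cph = CoxPHFitter(penalizer=penalizer, l1_ratio=l1_ratio)
        cph.fit(df, duration_col="time", event_col="event")
        return cph  # cph.concordance_index_ gives the training C-index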
Abstract: Central venous catheters (CVCs) are commonly used in critical care settings for monitoring body functions and administering medications. They are often described in radiology reports in terms of their presence, identity, and placement. In this paper, we address the automatic detection of CVC presence and identity by combining deep-learning-based segmentation with classification based on the intersection of the segmented catheter with shape priors previously learned from clinician annotations. Our approach not only outperforms existing catheter detection methods, achieving 85.2% accuracy at 91.6% precision, but also enables high-precision (95.2%) classification of catheter types on a large dataset of over 10,000 chest X-rays, offering a robust and practical solution to this problem.
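A minimal sketch of the classification step, assuming a segmentation network has already produced a binary catheter mask and that per-type shape priors (pixel-occupancy maps learned from clinician annotations) are available as arrays of the same size; the function, argument names, and threshold are hypothetical stand-ins, not the paper's code.

    import numpy as np

    def classify_catheter(pred_mask, shape_priors, presence_thresh=50):
        """pred_mask: HxW binary array; shape_priors: dict mapping type -> HxW prior map."""
        if pred_mask.sum() < presence_thresh:      # too few pixels: no catheter detected
            return None
        # Fraction of the predicted catheter pixels falling inside each type's prior.
        overlaps = {name: (pred_mask * prior).sum() / pred_mask.sum()
                    for name, prior in shape_priors.items()}
        return max(overlaps, key=overlaps.get)     # type with the highest overlap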