This paper presents a generic probabilistic framework for estimating the statistical dependency and finding the anatomical correspondences among an arbitrary number of medical images. The method builds on a novel formulation of the $N$-dimensional joint intensity distribution, which represents the common anatomy as latent variables and estimates the appearance model with nonparametric estimators. Through a connection to maximum likelihood and the expectation-maximization algorithm, we derive an information\hyp{}theoretic metric, the $\mathcal{X}$-metric, and a co-registration algorithm, $\mathcal{X}$-CoReg, which together allow groupwise registration of the $N$ observed images with a computational complexity of $\mathcal{O}(N)$. Moreover, the method extends naturally to a weakly supervised scenario in which anatomical labels are provided for some of the images. This leads to a combined\hyp{}computing framework, implemented with deep learning, that performs registration and segmentation simultaneously and collaboratively in an end-to-end fashion. Extensive experiments were conducted to demonstrate the versatility and applicability of our model, including multimodal groupwise registration, motion correction for dynamic contrast-enhanced magnetic resonance images, and deep combined computing for multimodal medical images. The results show the superiority of our method in these applications in terms of both accuracy and efficiency, highlighting the advantage of the proposed representation of the imaging process.