Magnetic Resonance Imaging (MRI) is a powerful medical imaging modality, but unfortunately suffers from long scan times which, aside from increasing operational costs, can lead to image artifacts due to patient motion. Motion during the acquisition leads to inconsistencies in measured data that manifest as blurring and ghosting if unaccounted for in the image reconstruction process. Various deep learning based reconstruction techniques have been proposed which decrease scan time by reducing the number of measurements needed for a high fidelity reconstructed image. Additionally, deep learning has been used to correct motion using end-to-end techniques. This, however, increases susceptibility to distribution shifts at test time (sampling pattern, motion level). In this work we propose a framework for jointly reconstructing highly sub-sampled MRI data while estimating patient motion using score-based generative models. Our method does not make specific assumptions on the sampling trajectory or motion pattern at training time and thus can be flexibly applied to various types of measurement models and patient motion. We demonstrate our framework on retrospectively accelerated 2D brain MRI corrupted by rigid motion.