In cardiac CINE, motion-compensated MR reconstruction (MCMR) is an effective approach to address highly undersampled acquisitions by incorporating motion information between frames. In this work, we propose a deep learning-based framework to address the MCMR problem efficiently. Contrary to state-of-the-art (SOTA) MCMR methods which break the original problem into two sub-optimization problems, i.e. motion estimation and reconstruction, we formulate this problem as a single entity with one single optimization. We discard the canonical motion-warping loss (similarity measurement between motion-warped images and target images) to estimate the motion, but drive the motion estimation process directly by the final reconstruction performance. The higher reconstruction quality is achieved without using any smoothness loss terms and without iterative processing between motion estimation and reconstruction. Therefore, we avoid non-trivial loss weighting factors tuning and time-consuming iterative processing. Experiments on 43 in-house acquired 2D CINE datasets indicate that the proposed MCMR framework can deliver artifact-free motion estimation and high-quality MR images even for imaging accelerations up to 20x. The proposed framework is compared to SOTA non-MCMR and MCMR methods and outperforms these methods qualitatively and quantitatively in all applied metrics across all experiments with different acceleration rates.