In this work, we proposed a continuous-acquisition strategy that uses a gradient echo (GRE) inversion recovery sequence with spiral trajectories to obtain $T_1$ maps and CINE images simultaneously. Data are acquired in a free-breathing, ungated fashion. A variational auto-encoder (VAE) based approach estimates the motion signals from the central k-space data. The estimated motion signals are then used to train a deep manifold reconstruction algorithm for image reconstruction. Once the network is trained, its latent vectors (the estimated motion signals and the contrast signal) can be set freely to generate any desired image frames in the time series. $T_1$ maps are estimated from generated frames in which only the contrast varies, and breath-hold CINE images can be generated at different contrasts.
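
To illustrate the final step, the following minimal sketch fits $T_1$ for a single pixel from a series of generated frames in which only the contrast varies. The text does not specify the signal model, so this sketch assumes the common three-parameter inversion recovery model $S(t) = A - B\,e^{-t/T_1^*}$ with the Look-Locker correction $T_1 = T_1^*(B/A - 1)$. The function name `fit_t1_pixel`, the search grid, and the units (ms) are hypothetical choices made for illustration only.

```python
import numpy as np


def fit_t1_pixel(ti, signal, t1_star_grid=None):
    """Estimate T1 (ms) for one pixel from signed inversion recovery samples.

    Assumed model (not taken from the text): S(t) = A - B * exp(-t / T1*),
    followed by the Look-Locker correction T1 = T1* * (B / A - 1).
    """
    ti = np.asarray(ti, dtype=float)
    signal = np.asarray(signal, dtype=float)
    if t1_star_grid is None:
        # Hypothetical search range: 50 ms to 3000 ms in 1 ms steps.
        t1_star_grid = np.linspace(50.0, 3000.0, 2951)

    best_resid, best_t1_star, best_a, best_b = np.inf, None, None, None
    for t1_star in t1_star_grid:
        # For a fixed T1*, the model is linear in (A, B), so solve by
        # ordinary least squares and keep the T1* with the lowest residual.
        design = np.column_stack([np.ones_like(ti), -np.exp(-ti / t1_star)])
        coef, *_ = np.linalg.lstsq(design, signal, rcond=None)
        resid = float(np.sum((design @ coef - signal) ** 2))
        if resid < best_resid:
            best_resid, best_t1_star = resid, t1_star
            best_a, best_b = coef[0], coef[1]

    # Look-Locker correction from apparent T1* to T1.
    return best_t1_star * (best_b / best_a - 1.0)
```

Applying such a fit to every pixel of the generated contrast series would yield a $T_1$ map. The grid search over $T_1^*$ is chosen here for robustness and simplicity; a nonlinear least-squares solver would be a reasonable alternative.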