Data consistency with the physical forward model is crucial in inverse problems, especially in MR image reconstruction. The standard approach is to unroll an iterative algorithm into a neural network with the forward model embedded. Because the forward model changes frequently in clinical practice, entangling the learned component with the forward model makes the reconstruction hard to generalize. We propose a deep learning-based proximal gradient descent method that learns a regularization term independent of the forward model. By separating the forward model from the deep learning component, the proposed method generalizes better across MR acquisition settings. To validate it, we applied the regularization term, trained only once, to different MR acquisition settings and compared the reconstructions with those obtained using the commonly used $\ell_1$ regularization. The proposed method improved the peak signal-to-noise ratio by approximately 3 dB over conventional $\ell_1$-regularized reconstruction. We also demonstrated the flexibility of the proposed method with respect to the choice of undersampling pattern and evaluated the effect of parameter tuning for the deep learning regularization.
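
To illustrate the separation between the forward model and the regularizer, the following is a minimal sketch of a generic proximal gradient iteration, $x_{k+1} = \mathrm{prox}\big(x_k - \alpha A^H(Ax_k - y)\big)$, and not the implementation evaluated in this work. It assumes a hypothetical single-coil Cartesian forward model (a binary sampling mask applied to an orthonormal 2D FFT), and it uses complex soft-thresholding as a stand-in for the $\ell_1$ proximal step. The function names `forward`, `adjoint`, `soft_threshold`, and `proximal_gradient`, as well as the step size and iteration count, are illustrative assumptions. In this picture, a learned regularizer would be passed in place of `prox`, while `mask` can change between acquisitions without retraining.

```python
import numpy as np

def forward(x, mask):
    """Assumed forward model A: image -> undersampled k-space (single coil)."""
    return mask * np.fft.fft2(x, norm="ortho")

def adjoint(y, mask):
    """Adjoint A^H: masked k-space -> zero-filled image."""
    return np.fft.ifft2(mask * y, norm="ortho")

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 for complex x (shrinks magnitudes)."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return x * scale

def proximal_gradient(y, mask, prox, step=1.0, n_iter=50):
    """Generic proximal gradient descent.

    The data-consistency gradient depends only on the forward model (mask);
    the regularizer enters only through the callable `prox`, so the two can
    be swapped independently. With an orthonormal FFT and a binary mask the
    gradient is 1-Lipschitz, so step=1.0 is an admissible step size.
    """
    x = adjoint(y, mask)  # zero-filled initialization
    for _ in range(n_iter):
        grad = adjoint(forward(x, mask) - y, mask)  # A^H (A x - y)
        x = prox(x - step * grad)                   # regularization step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy image that is sparse in the image domain (illustrative only).
    x_true = np.zeros((32, 32), dtype=complex)
    idx = rng.choice(32 * 32, size=40, replace=False)
    x_true.flat[idx] = rng.standard_normal(40)

    # Two different undersampling patterns, same regularizer.
    for rate in (0.5, 0.3):
        mask = (rng.random((32, 32)) < rate).astype(float)
        y = forward(x_true, mask)
        x_rec = proximal_gradient(y, mask, lambda v: soft_threshold(v, 0.01))
        err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
        print(f"sampling rate {rate}: relative error {err:.3f}")
```

The design point of the sketch is that `proximal_gradient` never inspects the regularizer and `prox` never inspects the mask, which mirrors the decoupling described above under the stated assumptions.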