We introduce a multi-scale energy formulation for plug-and-play (PnP) image recovery. The main highlight of the proposed framework is the energy formulation, in which the negative log-prior of the image distribution is learned by a convolutional neural network (CNN) module. The energy formulation enables us to introduce optimization algorithms with guaranteed convergence, even when the CNN module is not constrained to be a contraction. Current PnP methods, which often lack a well-defined energy formulation, require a contraction constraint that restricts their performance in challenging applications. The energy and the corresponding score function are learned from reference data using denoising score matching, where the noise variance serves as a smoothness parameter that controls the shape of the learned energy function. We introduce a multi-scale optimization strategy, in which a sequence of progressively less smooth approximations of the true prior is used during optimization. This approach improves the convergence of the algorithm to the global minimum, which translates to improved performance. Preliminary results in the context of MRI show that the multi-scale energy PnP framework offers performance comparable to that of unrolled algorithms. Unlike unrolled methods, the proposed PnP approach can work with arbitrary forward models, making it an easier option for clinical deployment. In addition, training the proposed model is more efficient in terms of memory and computation, making it attractive in large-scale (e.g., 4D) settings.
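The coarse-to-fine idea can be illustrated with a minimal one-dimensional sketch. It is not the proposed method: the CNN energy is replaced by an assumed closed-form stand-in (a two-component Gaussian mixture, for which smoothing with noise of standard deviation `sigma` simply adds `sigma**2` to each component variance), the forward model is taken to be the identity, and all names and constants (`MEANS`, `WEIGHTS`, `multi_scale`, the `sigma` schedule) are hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical toy prior: a two-component 1-D Gaussian mixture that stands in
# for the CNN-learned energy described in the text.
MEANS = np.array([-2.0, 2.0])
WEIGHTS = np.array([0.3, 0.7])
BASE_VAR = 0.2 ** 2

# Assumed measurement model: y = x + noise (identity forward operator).
Y, DATA_VAR = 0.5, 4.0


def energy(x, sigma):
    """Negative log of the prior after smoothing with noise of std `sigma`."""
    var = BASE_VAR + sigma ** 2
    dens = WEIGHTS * np.exp(-(x - MEANS) ** 2 / (2.0 * var))
    return float(-np.log(dens.sum() / np.sqrt(2.0 * np.pi * var)))


def energy_grad(x, sigma):
    """Gradient of the smoothed energy (the negative score)."""
    var = BASE_VAR + sigma ** 2
    logits = np.log(WEIGHTS) - (x - MEANS) ** 2 / (2.0 * var)
    resp = np.exp(logits - logits.max())  # stabilised mixture responsibilities
    resp /= resp.sum()
    return float(np.sum(resp * (x - MEANS)) / var)


def cost(x, sigma):
    """Data-consistency term plus the smoothed prior energy."""
    return (x - Y) ** 2 / (2.0 * DATA_VAR) + energy(x, sigma)


def descend(x, sigma, n_iter=200):
    """Plain gradient descent on the cost at a single smoothing level."""
    var = BASE_VAR + sigma ** 2
    # 1/DATA_VAR + 1/var upper-bounds the curvature of the cost for this toy
    # prior, so this step size never increases the cost.
    step = 1.0 / (1.0 / DATA_VAR + 1.0 / var)
    for _ in range(n_iter):
        x = x - step * ((x - Y) / DATA_VAR + energy_grad(x, sigma))
    return x


def multi_scale(x0, sigmas=(3.0, 2.0, 1.0, 0.5, 0.25, 0.0)):
    """Coarse-to-fine descent: warm-start each scale from the previous one."""
    x = x0
    for sigma in sigmas:
        x = descend(x, sigma)
    return x


x_single = descend(-3.0, sigma=0.0)  # optimise only the unsmoothed energy
x_multi = multi_scale(-3.0)          # optimise a sequence of smoothed energies
print("single-scale:", x_single, "cost:", cost(x_single, 0.0))
print("multi-scale: ", x_multi, "cost:", cost(x_multi, 0.0))
```

With this starting point, the single-scale run settles in the local minimum next to the weaker mixture component, whereas the multi-scale run first minimizes a heavily smoothed cost and then tracks the minimizer toward the dominant component as the smoothing is reduced, reaching a lower value of the unsmoothed cost. The schedule and step-size rule are design choices for this toy problem only.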