Economic model predictive control (EMPC) is a promising methodology for optimal operation of dynamical processes that has been shown to improve process economics considerably. However, EMPC performance relies heavily on the accuracy of the process model used. As an alternative to model-based control strategies, reinforcement learning (RL) has been investigated as a model-free control methodology, but issues regarding its safety and stability remain an open research challenge. This work presents a novel framework for integrating EMPC and RL for online model parameter estimation of a class of nonlinear systems. In this framework, EMPC optimally operates the closed loop system while maintaining closed loop stability and recursive feasibility. At the same time, to optimize the process, the RL agent continuously compares the measured state of the process with the model's predictions (nominal states), and modifies model parameters accordingly. The major advantage of this framework is its simplicity; state-of-the-art RL algorithms and EMPC schemes can be employed with minimal modifications. The performance of the proposed framework is illustrated on a network of reactions with challenging dynamics and practical significance. This framework allows control, optimization, and model correction to be performed online and continuously, making autonomous reactor operation more attainable.