Modulation instability is a phenomenon of spontaneous pattern formation in nonlinear media, oftentimes leading to an unpredictable behaviour and a degradation of a signal of interest. We propose an approach based on reinforcement learning to suppress the unstable modes by optimizing the parameters for the time modulation of the potential in the nonlinear system. We test our approach in 1D and 2D cases and propose a new class of physically-meaningful reward functions to guarantee tamed instability.