Reconfigurable intelligent surface (RIS) technology is receiving significant attention as a key enabling technology for 6G communications, with much of the work focused on coverage infill and wireless power transfer. However, relatively little attention has been paid to radiation pattern fidelity, for example, sidelobe suppression. In multi-user coverage infill, direct beam pattern synthesis by superposition can produce undesirably high sidelobe levels. To address this issue, this paper introduces and applies deep reinforcement learning (DRL) to optimize the far-field pattern, achieving a 4 dB reduction in unwanted sidelobe levels and thereby improving energy efficiency and reducing co-channel interference.