Variational Quantum Algorithms have emerged as promising tools for solving optimization problems on quantum computers. These algorithms leverage a parametric quantum circuit called ansatz, where its parameters are adjusted by a classical optimizer with the goal of optimizing a certain cost function. However, a significant challenge lies in designing effective circuits for addressing specific problems. In this study, we leverage the powerful and flexible Reinforcement Learning paradigm to train an agent capable of autonomously generating quantum circuits that can be used as ansatzes in variational algorithms to solve optimization problems. The agent is trained on diverse problem instances, including Maximum Cut, Maximum Clique and Minimum Vertex Cover, built from different graph topologies and sizes. Our analysis of the circuits generated by the agent and the corresponding solutions shows that the proposed method is able to generate effective ansatzes. While our goal is not to propose any new specific ansatz, we observe how the agent has discovered a novel family of ansatzes effective for Maximum Cut problems, which we call $R_{yz}$-connected. We study the characteristics of one of these ansatzes by comparing it against state-of-the-art quantum algorithms across instances of varying graph topologies, sizes, and problem types. Our results indicate that the $R_{yz}$-connected circuit achieves high approximation ratios for Maximum Cut problems, further validating our proposed agent. In conclusion, our study highlights the potential of Reinforcement Learning techniques in assisting researchers to design effective quantum circuits which could have applications in a wide number of tasks.