The problem of Raman amplifier optimization is studied. A differentiable interpolation function for the Raman gain coefficient is obtained using machine learning (ML), which enables gradient-descent optimization of forward-propagating Raman pumps. Both the frequencies and the powers of an arbitrary number of pumps in a forward-pumping configuration are then optimized for an arbitrary data channel load and span length. The forward propagation model is combined with an experimentally trained ML model of a backward-pumped Raman amplifier to jointly optimize the frequencies and powers of the forward amplifier's pumps and the powers of the backward amplifier's pumps. The joint forward and backward amplifier optimization is demonstrated for unrepeatered transmission over 250~km, achieving a gain flatness of $<1$~dB over a 4~THz bandwidth. The optimized amplifiers are validated using a numerical simulator.
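
To illustrate why a differentiable gain coefficient is needed, the following is a minimal sketch that assumes the standard coupled power-evolution model for co-propagating waves, neglecting noise, Rayleigh backscattering, and polarization effects. The model, the symbols $P_i$, $\nu_i$, $\alpha_i$, $A_{\mathrm{eff}}$, $g_R$, and the example objective $J$ are illustrative assumptions and are not taken from the text above:
\begin{equation}
\frac{dP_i}{dz} = -\alpha_i P_i
 + \sum_{j:\,\nu_j>\nu_i} \frac{g_R(\nu_j-\nu_i)}{A_{\mathrm{eff}}}\, P_j P_i
 - \sum_{j:\,\nu_j<\nu_i} \frac{\nu_i}{\nu_j}\, \frac{g_R(\nu_i-\nu_j)}{A_{\mathrm{eff}}}\, P_j P_i .
\end{equation}
Here $P_i(z)$ is the power of the wave at frequency $\nu_i$ (pump or data channel), $\alpha_i$ its attenuation coefficient, and $A_{\mathrm{eff}}$ the effective area. Pump powers enter only through the initial conditions $P_i(0)$, whereas pump frequencies enter through the argument of $g_R$. For a hypothetical flatness objective such as $J(\theta)=\max_k G_k-\min_k G_k$ over the channel gains $G_k$, with $\theta$ collecting the pump frequencies and powers, a gradient step
\begin{equation}
\theta \leftarrow \theta - \eta\, \nabla_\theta J(\theta)
\end{equation}
therefore requires $dg_R/d\Delta\nu$ at arbitrary frequency separations $\Delta\nu$. A smooth interpolant of the measured gain spectrum can supply this derivative; tabulated values alone cannot.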