Direct loss minimization is a popular approach for learning predictors over structured label spaces. This approach is computationally appealing as it replaces integration with optimization and allows to propagate gradients in a deep net using loss-perturbed prediction. Recently, this technique was extended to generative models, while introducing a randomized predictor that samples a structure from a randomly perturbed score function. In this work, we learn the variance of these randomized structured predictors and show that it balances better between the learned score function and the randomized noise in structured prediction. We demonstrate empirically the effectiveness of learning the balance between the signal and the random noise in structured discrete spaces.