We present a method for supervised learning of sparsity-promoting regularizers for image denoising. Sparsity-promoting regularization is a key ingredient in solving modern image reconstruction problems; however, the operators underlying these regularizers are usually either designed by hand or learned from data in an unsupervised way. The recent success of supervised learning (mainly convolutional neural networks) in solving image reconstruction problems suggests that it could be a fruitful approach to designing regularizers. As a first experiment in this direction, we propose to denoise images using a variational formulation with a parametric, sparsity-promoting regularizer, where the parameters of the regularizer are learned to minimize the mean squared error of reconstructions on a training set of (ground truth image, measurement) pairs. Training involves solving a challenging bilievel optimization problem; we derive an expression for the gradient of the training loss using Karush-Kuhn-Tucker conditions and provide an accompanying gradient descent algorithm to minimize it. Our experiments on a simple synthetic, denoising problem show that the proposed method can learn an operator that outperforms well-known regularizers (total variation, DCT-sparsity, and unsupervised dictionary learning) and collaborative filtering. While the approach we present is specific to denoising, we believe that it can be adapted to the whole class of inverse problems with linear measurement models, giving it applicability to a wide range of image reconstruction problems.