A novel method for learning optimal, orthonormal wavelet bases for representing 1- and 2D signals, based on parallels between the wavelet transform and fully connected artificial neural networks, is described. The structural similarities between these two concepts are reviewed and combined to a "wavenet", allowing for the direct learning of optimal wavelet filter coefficient through stochastic gradient descent with back-propagation over ensembles of training inputs, where conditions on the filter coefficients for constituting orthonormal wavelet bases are cast as quadratic regularisations terms. We describe the practical implementation of this method, and study its performance for high-energy physics collision events for QCD $2 \to 2$ processes. It is shown that an optimal solution is found, even in a high-dimensional search space, and the implications of the result are discussed.