Sum-Product Networks (SPNs) are hierarchical probabilistic graphical models capable of fast and exact inference. Applications of SPNs to real-world data such as large image datasets has been fairly limited in previous literature. We introduce Convolutional Sum-Product Networks (ConvSPNs) which exploit the inherent structure of images in a way similar to deep convolutional neural networks, optionally with weight sharing. ConvSPNs encode spatial relationships through local products and local sum operations. ConvSPNs obtain state-of-the-art results compared to other SPN-based approaches on several visual datasets, including color images, for both generative as well as discriminative tasks. ConvSPNs are the first pure-SPN models applied to color images that do not depend on additional techniques for feature extraction. In addition, we introduce two novel methods for regularizing SPNs trained with hard EM. Both regularization methods have been motivated by observing an exponentially decreasing variance of log probabilities with respect to the depth of randomly structured SPNs. We show that our regularization provides substantial further improvements in generative visual tasks.