In this work, we propose Wasserstein distributionally robust shallow convex neural networks (WaDiRo-SCNNs) to provide reliable nonlinear predictions when subject to adverse and corrupted datasets. Our approach is based on a new convex training program for ReLU shallow neural networks which allows us to cast the problem as an exact, tractable reformulation of its order-1 Wasserstein distributionally robust equivalent. Our training procedure is conservative by design, has low stochasticity, is solvable with open-source solvers, and is scalable to large industrial deployments. We provide out-of-sample performance guarantees and show that hard convex physical constraints can be enforced in the training program. WaDiRo-SCNN aims to make neural networks safer for critical applications, such as in the energy sector. Finally, we numerically demonstrate the performance of our model on a synthetic experiment and a real-world power system application, i.e., the prediction of non-residential buildings' hourly energy consumption. The experimental results are convincing and showcase the strengths of the proposed model.