Deep neural networks show impressive performance in medical imaging tasks. However, many current networks generalise poorly to data unseen during training, for example data generated by different medical centres. Such behaviour can be caused by networks overfitting easy-to-learn, or statistically dominant, features while disregarding other potentially informative features. Moreover, dominant features can lead to learning spurious correlations. For instance, imperceptible differences in the sharpness of images from two different scanners can significantly degrade the performance of a network. To address these challenges, we evaluate the utility of spectral decoupling in the context of medical image classification. Spectral decoupling encourages the neural network to learn more features by simply regularising the network's unnormalised prediction scores with an L2 penalty. Simulation experiments show that spectral decoupling allows training neural networks on datasets with strong spurious correlations. Networks trained without spectral decoupling do not learn the original task and appear to make false predictions based on the spurious correlations. Spectral decoupling also significantly increases networks' robustness to data distribution shifts. To validate our findings, we train networks with and without spectral decoupling to detect prostate cancer on haematoxylin and eosin stained whole slide images. The networks are then evaluated on data scanned in the same medical centre with two different scanners, and on data from a different centre. On the dataset from a different medical centre, networks trained with spectral decoupling achieve 10 percentage points higher accuracy than networks trained with weight decay. Our results show that spectral decoupling allows training robust neural networks to be used across multiple medical centres, and we recommend its use in future medical imaging tasks.
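
As an illustration of the L2 penalty on unnormalised prediction scores described above, the following is a minimal sketch in plain Python rather than the implementation used in the experiments. The function names, the coefficient `sd_lambda`, the factor of one half and the averaging over the batch are all assumptions made for this example. The point it shows is that the penalty is applied to the network outputs (logits), whereas weight decay penalises the network weights.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from unnormalised scores."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def spectral_decoupling_loss(batch_logits, labels, sd_lambda=0.01):
    """Mean cross-entropy plus an L2 penalty on the logits.

    sd_lambda is a hypothetical name for the penalty coefficient;
    its value here is illustrative only.
    """
    n = len(batch_logits)
    ce = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / n
    # Sum of squared logits per sample, averaged over the batch.
    penalty = sum(v * v for z in batch_logits for v in z) / n
    return ce + 0.5 * sd_lambda * penalty


# Toy batch: two samples, two classes (values chosen arbitrarily).
batch_logits = [[2.0, -1.0], [0.5, 0.5]]
labels = [0, 1]

loss_plain = spectral_decoupling_loss(batch_logits, labels, sd_lambda=0.0)
loss_sd = spectral_decoupling_loss(batch_logits, labels, sd_lambda=0.1)
```

In a training framework the same penalty term would typically be added to the classification loss before backpropagation, with the weight decay of the optimiser set to zero so that the two regularisers are not combined.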