Deep learning methods are highly effective for many image reconstruction tasks. However, models trained with supervised learning can degrade in performance when applied to different experimental settings at test time or, more generally, in the presence of distribution shifts. In this study, we demonstrate that pruning deep image reconstruction networks at training time can improve their robustness to distribution shifts. In particular, we consider unrolled reconstruction architectures for accelerated magnetic resonance imaging (MRI) and introduce a method for pruning unrolled networks (PUN) at initialization. Our experiments demonstrate that, compared to traditional dense networks, PUN offers improved generalization across a variety of experimental settings and even slight performance gains on in-distribution data.
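To make the core idea of pruning at initialization concrete, below is a minimal PyTorch sketch. It is an illustration under stated assumptions, not the paper's PUN procedure: the toy `UnrolledNet`, the helper `prune_at_init`, and the global L1-magnitude criterion are all hypothetical stand-ins, since the abstract does not specify PUN's pruning criterion. The sketch only demonstrates the general mechanism of fixing a sparse binary mask over the weights before training begins.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class UnrolledNet(nn.Module):
    """Toy stand-in for an unrolled reconstruction network:
    a few residual refinement stages, purely illustrative."""
    def __init__(self, n_iters=4, channels=32):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(2, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, 2, 3, padding=1),
            )
            for _ in range(n_iters)
        )

    def forward(self, x):
        for stage in self.stages:
            x = x + stage(x)  # one refinement per unrolled iteration
        return x

def prune_at_init(model, sparsity=0.9):
    """Apply a fixed, global unstructured mask to all conv weights
    *before* training. The L1-magnitude-at-init criterion is an
    assumption; the abstract does not describe PUN's criterion."""
    params = [
        (m, "weight") for m in model.modules() if isinstance(m, nn.Conv2d)
    ]
    prune.global_unstructured(
        params, pruning_method=prune.L1Unstructured, amount=sparsity
    )
    # The mask multiplies weight_orig in every forward pass, so pruned
    # entries receive zero gradient and remain zero throughout training.
    return model

model = prune_at_init(UnrolledNet(), sparsity=0.9)
y = model(torch.randn(1, 2, 64, 64))  # e.g., real/imaginary image channels
print(y.shape)  # torch.Size([1, 2, 64, 64])
```

Because the mask is registered before any optimization step, the network is trained sparse from the start rather than pruned after convergence, which is what distinguishes pruning at initialization from post-training pruning.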