The dictionary learning problem, representing data as a combination of few atoms, has long stood as a popular method for learning representations in statistics and signal processing. The most popular dictionary learning algorithm alternates between sparse coding and dictionary update steps, and a rich literature has studied its theoretical convergence. The growing popularity of neurally plausible unfolded sparse coding networks has led to the empirical finding that backpropagation through such networks performs dictionary learning. This paper offers the first theoretical proof for these empirical results through PUDLE, a Provable Unfolded Dictionary LEarning method. We highlight the impact of loss, unfolding, and backpropagation on convergence. We discover an implicit acceleration: as a function of unfolding, the backpropagated gradient converges faster and is more accurate than the gradient from alternating minimization. We complement our findings through synthetic and image denoising experiments. The findings support the use of accelerated deep learning optimizers and unfolded networks for dictionary learning.