Single-molecule localization microscopy constructs super-resolution images by the sequential imaging and computational localization of sparsely activated fluorophores. Accurate and efficient fluorophore localization algorithms are key to the success of this computational microscopy method. We present a novel localization algorithm based on deep learning which significantly improves upon the state of the art. Our contributions are a novel network architecture for simultaneous detection and localization, and a new training algorithm which enables this deep network to solve the Bayesian inverse problem of detecting and localizing single molecules. Our network architecture uses temporal context from multiple sequentially imaged frames to detect and localize molecules. Our training algorithm combines simulation-based supervised learning with autoencoder-based unsupervised learning to make it more robust against mismatch in the generative model. We demonstrate the performance of our method on datasets imaged using a variety of point spread functions and fluorophore densities. While existing localization algorithms can achieve optimal localization accuracy in data with low fluorophore density, they are confounded by high densities. Our method significantly outperforms the state of the art at high densities and thus, enables faster imaging than previous approaches. Our work also more generally shows how to train deep networks to solve challenging Bayesian inverse problems in biology and physics.