Crime prediction is a widely studied research problem because of its importance in ensuring the safety of city dwellers. Early work relied on statistical and classical machine learning methods; in recent years, researchers have turned to deep learning-based models. These models use complex architectures to capture latent features in crime data, and they outperform statistical and classical machine learning methods. However, a significant gap remains in understanding the applicability of different models to different real-life scenarios, as no longitudinal study exists comparing all these approaches in a unified setting. In this paper, we conduct a comprehensive experimental evaluation of all major state-of-the-art deep learning-based crime prediction models. Our evaluation provides several key insights into the pros and cons of these models, which enable us to select the most suitable models for different application scenarios. Based on these findings, we further recommend design practices that should be taken into account when building future deep learning-based crime prediction models.