Remote sensing (RS) image retrieval based on visual content is of great significance for geological information mining. Over the past two decades, a large body of research has addressed this task, focusing mainly on three core issues of image retrieval: visual features, similarity metrics, and relevance feedback. With progress on these issues, RS image retrieval technology has become comparatively mature. However, owing to the complexity and diversity of high-resolution remote sensing (HRRS) images, current methods still leave room for improvement in HRRS data retrieval. In this paper, we analyze these three key aspects of retrieval and provide a comprehensive review of content-based RS image retrieval methods. Furthermore, to advance the state of the art in HRRS image retrieval, we focus on the visual feature aspect and delve into how powerful deep representations can be used for this task. We systematically investigate the factors that may affect the performance of deep features. By optimizing each factor, we achieve remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental phenomena in detail and draw instructive conclusions from our analysis. Our work can serve as a guide for research on content-based RS image retrieval.