Simple, short, and compact hashtags cover a wide range of information on social networks. Although many works in natural language processing (NLP) have demonstrated the importance of hashtag recommendation, hashtag recommendation for images has barely been studied. In this paper, we introduce the HARRISON dataset, a benchmark for hashtag recommendation on real-world images in social networks. The HARRISON dataset is a realistic dataset composed of 57,383 photos from Instagram, with an average of 4.5 associated hashtags per photo. To evaluate our dataset, we design a baseline framework consisting of a visual feature extractor based on a convolutional neural network (CNN) and a multi-label classifier based on a neural network. Using this framework, we evaluate two single-feature models, one object-based and one scene-based, and an integrated model that combines them on the HARRISON dataset. Our results on the dataset show that the hashtag recommendation task requires a wide and contextual understanding of the situation conveyed in the image. As far as we know, this work is the first vision-only attempt at hashtag recommendation for real-world images in social networks. We expect this benchmark to accelerate the advancement of hashtag recommendation.
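
The baseline framework described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the feature dimensions, hidden size, toy hashtag vocabulary, random (untrained) weights, the use of concatenation to integrate the object-based and scene-based features, and the sigmoid outputs are all assumptions made for illustration, and the random vectors merely stand in for CNN outputs.

```python
import math
import random

random.seed(0)

# Hypothetical sizes and vocabulary; none of these come from the paper.
OBJ_DIM, SCENE_DIM, HIDDEN_DIM = 8, 6, 16
HASHTAGS = ["#love", "#beach", "#food", "#dog", "#sunset"]


def init_layer(in_dim, out_dim):
    """Random weights and zero biases for one dense layer (untrained)."""
    weights = [[random.gauss(0, 0.1) for _ in range(out_dim)]
               for _ in range(in_dim)]
    return weights, [0.0] * out_dim


def dense(x, layer):
    """Compute x @ W + b for a single input vector."""
    weights, bias = layer
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


def predict_scores(layers, features):
    """Return one independent score in (0, 1) per hashtag (multi-label)."""
    hidden = [max(0.0, h) for h in dense(features, layers[0])]  # ReLU
    logits = dense(hidden, layers[1])
    # Sigmoid per hashtag (assumed), so several hashtags can score high.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def recommend(scores, k):
    """Return the k hashtags with the highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [HASHTAGS[i] for i in ranked[:k]]


# Stand-ins for the outputs of an object-recognition CNN and a
# scene-recognition CNN applied to the same photo.
object_feat = [random.gauss(0, 1) for _ in range(OBJ_DIM)]
scene_feat = [random.gauss(0, 1) for _ in range(SCENE_DIM)]

# Integrated model (assumed): concatenate both features, then classify.
fused = object_feat + scene_feat
layers = [init_layer(len(fused), HIDDEN_DIM),
          init_layer(HIDDEN_DIM, len(HASHTAGS))]
scores = predict_scores(layers, fused)
print(recommend(scores, k=3))
```

Passing only `object_feat` or only `scene_feat` to a classifier of matching input size would correspond to the two single-feature models in this sketch.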