Off-the-shelf convolutional neural network features achieve state-of-the-art results in many image retrieval tasks. However, their invariance is predetermined by the network architecture and training data. In this work, we propose aggregating features extracted from transformed versions of an image to increase the invariance of off-the-shelf features without fine-tuning or modifying the network. We efficiently learn an ensemble of beneficial image transformations through reinforcement learning. Experimental results show that the learned ensemble of transformations is effective and transferable.
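
As a minimal sketch of the aggregation idea only, the snippet below extracts a descriptor from each transformed copy of an image, L2-normalizes it, and sums the results. Everything named here is an assumption made for illustration: `toy_extractor` is a hypothetical stand-in for a frozen CNN, sum-then-normalize is one plausible aggregation rule, and the two transformations are hand-picked rather than learned through reinforcement learning.

```python
import numpy as np

def toy_extractor(image):
    """Hypothetical stand-in for a frozen CNN descriptor: column-wise mean intensity."""
    return image.mean(axis=0)

def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    return v / (np.linalg.norm(v) + eps)

def aggregate_features(image, transforms, extractor):
    """Sum the normalized descriptors of every transformed copy, then renormalize.

    Sum aggregation is an assumption; the extractor itself is never modified.
    """
    feats = [l2_normalize(extractor(t(image))) for t in transforms]
    return l2_normalize(np.sum(feats, axis=0))

# Hand-picked ensemble for illustration: identity and horizontal flip.
TRANSFORMS = [lambda x: x, np.fliplr]

rng = np.random.default_rng(0)
image = rng.random((8, 8))

desc = aggregate_features(image, TRANSFORMS, toy_extractor)
desc_flipped = aggregate_features(np.fliplr(image), TRANSFORMS, toy_extractor)
```

In this toy setting the raw descriptor changes when the image is mirrored, while the aggregated descriptor does not, because the identity and flip transformations produce the same set of descriptors for an image and its mirror.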