Deep learning has recently been applied to automatically classify the modulation categories of received radio signals without manual experience. However, training deep learning models requires massive volume of data. An insufficient training data will cause serious overfitting problem and degrade the classification accuracy. To cope with small dataset, data augmentation has been widely used in image processing to expand the dataset and improve the robustness of deep learning models. However, in wireless communication areas, the effect of different data augmentation methods on radio modulation classification has not been studied yet. In this paper, we evaluate different data augmentation methods via a state-of-the-art deep learning-based modulation classifier. Based on the characteristics of modulated signals, three augmentation methods are considered, i.e., rotation, flip, and Gaussian noise, which can be applied in both training phase and inference phase of the deep learning algorithm. Numerical results show that all three augmentation methods can improve the classification accuracy. Among which, the rotation augmentation method outperforms the flip method, both of which achieve higher classification accuracy than the Gaussian noise method. Given only 12.5% of training dataset, a joint rotation and flip augmentation policy can achieve even higher classification accuracy than the baseline with initial 100% training dataset without augmentation. Furthermore, with data augmentation, radio modulation categories can be successfully classified using shorter radio samples, leading to a simplified deep learning model and shorter the classification response time.