Recent advances of deep learning makes it possible to identify specific events in videos with greater precision. This has great relevance in sports like tennis in order to e.g., automatically collect game statistics, or replay actions of specific interest for game strategy or player improvements. In this paper, we investigate the potential and the challenges of using deep learning to classify tennis actions. Three models of different size, all based on the deep learning architecture SlowFast were trained and evaluated on the academic tennis dataset THETIS. The best models achieve a generalization accuracy of 74 %, demonstrating a good performance for tennis action classification. We provide an error analysis for the best model and pinpoint directions for improvement of tennis datasets in general. We discuss the limitations of the data set, general limitations of current publicly available tennis data-sets, and future steps needed to make progress.