Purpose : Because functional MRI (fMRI) data sets are in general small, we sought a data efficient approach to resting state fMRI classification of autism spectrum disorder (ASD) versus neurotypical (NT) controls. We hypothesized that a Deep Reinforcement Learning (DRL) classifier could learn effectively on a small fMRI training set. Methods : We trained a Deep Reinforcement Learning (DRL) classifier on 100 graph-label pairs from the Autism Brain Imaging Data Exchange (ABIDE) database. For comparison, we trained a Supervised Deep Learning (SDL) classifier on the same training set. Results : DRL significantly outperformed SDL, with a p-value of 2.4 x 10^(-7). DRL achieved superior results for a variety of classifier performance metrics, including an F1 score of 76, versus 67 for SDL. Whereas SDL quickly overfit the training data, DRL learned in a progressive manner that generalised to the separate testing set. Conclusion : DRL can learn to classify ASD versus NT in a data efficient manner, doing so for a small training set. Future work will involve optimizing the neural network for data efficiency and applying the approach to other fMRI data sets, namely for brain cancer patients.