We present the TFF Transformer framework for the analysis of functional Magnetic Resonance Imaging (fMRI) data. TFF employs a transformer-based architecture and a two-phase training approach. First, self-supervised training is applied to a collection of fMRI scans, where the model is trained for the reconstruction of 3D volume data. Second, the pre-trained model is fine-tuned on specific tasks, utilizing ground truth labels. Our results show state-of-the-art performance on a variety of fMRI tasks, including age and gender prediction, as well as schizophrenia recognition.