High spatial and temporal resolution across the whole brain is essential to accurately resolve neural activity in fMRI. Accelerated imaging techniques therefore target improved coverage with high spatio-temporal resolution. Simultaneous multi-slice (SMS) imaging combined with in-plane acceleration is used in large studies that involve ultrahigh-field fMRI, such as the Human Connectome Project. However, at even higher acceleration rates, these methods cannot be used reliably because of aliasing and noise artifacts. Deep learning (DL) reconstruction techniques have recently gained substantial interest for improving highly accelerated MRI. Supervised training of DL reconstructions generally requires fully sampled training datasets, which are not available for high-resolution fMRI studies. To tackle this challenge, self-supervised learning has been proposed for training DL reconstructions with only undersampled datasets, showing performance similar to that of supervised learning. In this study, we apply a self-supervised physics-guided DL reconstruction to 7T fMRI data acquired with 5-fold SMS and 4-fold in-plane acceleration. Our results show that the self-supervised DL reconstruction produces high-quality images at this 20-fold acceleration, substantially improving on existing methods, while showing similar functional precision and temporal effects in the subsequent analysis compared to a standard 10-fold accelerated acquisition.
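As a rough illustration of how training can proceed with only undersampled data, the sketch below shows one common way to set up self-supervision: the acquired k-space locations are split into two disjoint sets, one fed to the network's data-consistency step and one held out to define the training loss. The abstract does not specify the splitting scheme, so the function name `split_acquired_locations`, the 40% holdout fraction, the 128-line grid, and the uniform sampling pattern are all assumptions made for illustration. The sketch also spells out how the 5-fold SMS and 4-fold in-plane factors combine to the stated 20-fold acceleration.

```python
import random


def split_acquired_locations(acquired, holdout_fraction=0.4, seed=0):
    """Split acquired k-space line indices into two disjoint sets.

    Hypothetical helper: `holdout_fraction` is an assumed value,
    not one taken from the study.
    Returns (train_set, loss_set): train_set would feed data consistency,
    loss_set would be held out to compute the training loss.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must lie strictly between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(acquired)
    rng.shuffle(shuffled)
    n_loss = max(1, round(holdout_fraction * len(shuffled)))
    loss_set = set(shuffled[:n_loss])
    train_set = set(shuffled[n_loss:])
    return train_set, loss_set


# Total acceleration is the product of the SMS and in-plane factors.
sms_factor = 5
inplane_factor = 4
total_acceleration = sms_factor * inplane_factor

# Assumed example: 128 phase-encode lines with uniform 4-fold undersampling.
n_lines = 128
acquired = list(range(0, n_lines, inplane_factor))

train_set, loss_set = split_acquired_locations(acquired)
print(total_acceleration, len(acquired), len(train_set), len(loss_set))
```

Because the two sets are disjoint and together cover every acquired line, no fully sampled reference is needed: the network is scored only on measured samples it did not see as input.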