The accurate and efficient diagnosis of COVID-19 is of paramount importance, particularly in the context of large-scale medical imaging datasets. In this preprint paper, we propose a novel approach for COVID-19 diagnosis using CT images that leverages the power of Swin Transformer models, state-of-the-art solutions in computer vision tasks. Our method includes a systematic approach for patient-level predictions, where individual CT slices are classified as COVID-19 or non-COVID, and the patient's overall diagnosis is determined through majority voting. The application of the Swin Transformer in this context results in patient-level predictions that demonstrate exceptional diagnostic accuracy. In terms of evaluation metrics, our approach consistently outperforms the baseline, as well as numerous competing methods, showcasing its effectiveness in COVID-19 diagnosis. The macro F1 score achieved by our model exceeds the baseline and offers a robust solution for accurate diagnosis.