The integration of machine learning in medicine has significantly improved diagnostic precision, particularly in the interpretation of complex structures like the human brain. Diagnosing challenging conditions such as Alzheimer's disease has prompted the development of brain age estimation techniques. These methods often leverage three-dimensional Magnetic Resonance Imaging (MRI) scans, with recent studies emphasizing the efficacy of 3D convolutional neural networks (CNNs) like 3D ResNet. However, the untapped potential of Vision Transformers (ViTs), known for their accuracy and interpretability, persists in this domain due to limitations in their 3D versions. This paper introduces Triamese-ViT, an innovative adaptation of the ViT model for brain age estimation. Our model uniquely combines ViTs from three different orientations to capture 3D information, significantly enhancing accuracy and interpretability. Tested on a dataset of 1351 MRI scans, Triamese-ViT achieves a Mean Absolute Error (MAE) of 3.84, a 0.9 Spearman correlation coefficient with chronological age, and a -0.29 Spearman correlation coefficient between the brain age gap (BAG) and chronological age, significantly better than previous methods for brian age estimation. A key innovation of Triamese-ViT is its capacity to generate a comprehensive 3D-like attention map, synthesized from 2D attention maps of each orientation-specific ViT. This feature is particularly beneficial for in-depth brain age analysis and disease diagnosis, offering deeper insights into brain health and the mechanisms of age-related neural changes.