High-resolution tissue imaging is often compromised by sample-induced optical aberrations that degrade resolution and contrast. While wavefront sensor-based adaptive optics (AO) can measure these aberrations, such hardware solutions are typically complex, expensive to implement, and slow when serially mapping spatially varying aberrations across large fields of view. Here, we introduce AOViFT (Adaptive Optical Vision Fourier Transformer) -- a machine learning-based aberration sensing framework built around a 3D multistage Vision Transformer that operates on Fourier domain embeddings. AOViFT infers aberrations and restores diffraction-limited performance in puncta-labeled specimens with substantially reduced computational cost, training time, and memory footprint compared to conventional architectures or real-space networks. We validated AOViFT on live gene-edited zebrafish embryos, demonstrating its ability to correct spatially varying aberrations using either a deformable mirror or post-acquisition deconvolution. By eliminating the need for the guide star and wavefront sensing hardware and simplifying the experimental workflow, AOViFT lowers technical barriers for high-resolution volumetric microscopy across diverse biological samples.