Numerical Weather Prediction (NWP) system is an infrastructure that exerts considerable impacts on modern society.Traditional NWP system, however, resolves it by solving complex partial differential equations with a huge computing cluster, resulting in tons of carbon emission. Exploring efficient and eco-friendly solutions for NWP attracts interest from Artificial Intelligence (AI) and earth science communities. To narrow the performance gap between the AI-based methods and physic predictor, this work proposes a new transformer-based NWP framework, termed as WeatherFormer, to model the complex spatio-temporal atmosphere dynamics and empowering the capability of data-driven NWP. WeatherFormer innovatively introduces the space-time factorized transformer blocks to decrease the parameters and memory consumption, in which Position-aware Adaptive Fourier Neural Operator (PAFNO) is proposed for location sensible token mixing. Besides, two data augmentation strategies are utilized to boost the performance and decrease training consumption. Extensive experiments on WeatherBench dataset show WeatherFormer achieves superior performance over existing deep learning methods and further approaches the most advanced physical model.