Transformers are vital assets for the reliable and efficient operation of power and energy systems. They support the integration of renewables to the grid through improved grid stability and operation efficiency. Monitoring the health of transformers is essential to ensure grid reliability and efficiency. Thermal insulation ageing is a key transformer failure mode, which is generally tracked by monitoring the hotspot temperature (HST). However, HST measurement is complex and expensive and often estimated from indirect measurements. Existing computationally-efficient HST models focus on space-agnostic thermal models, providing worst-case HST estimates. This article introduces an efficient spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. The computational efficiency of the PINN model is improved through the implementation of the Residual-Based Attention scheme that accelerates the PINN model convergence. PINN based oil temperature predictions are used to estimate spatio-temporal transformer winding temperature values, which are validated through PDE resolution models and fiber optic sensor measurements, respectively. Furthermore, the spatio-temporal transformer ageing model is inferred, aiding transformer health management decision-making and providing insights into localized thermal ageing phenomena in the transformer insulation. Results are validated with a distribution transformer operated on a floating photovoltaic power plant.