Wind turbine power curve models translate ambient conditions into turbine power output. They are essential for energy yield prediction and turbine performance monitoring. In recent years, data-driven machine learning methods have outperformed parametric, physics-informed approaches. However, they are often criticised for being opaque "black boxes" which raises concerns regarding their robustness in non-stationary environments, such as faced by wind turbines. We, therefore, introduce an explainable artificial intelligence (XAI) framework to investigate and validate strategies learned by data-driven power curve models from operational SCADA data. It combines domain-specific considerations with Shapley Values and the latest findings from XAI for regression. Our results suggest, that learned strategies can be better indicators for model robustness than validation or test set errors. Moreover, we observe that highly complex, state-of-the-art ML models are prone to learn physically implausible strategies. Consequently, we compare several measures to ensure physically reasonable model behaviour. Lastly, we propose the utilization of XAI in the context of wind turbine performance monitoring, by disentangling environmental and technical effects that cause deviations from an expected turbine output. We hope, our work can guide domain experts towards training and selecting more transparent and robust data-driven wind turbine power curve models.