Most videos, including those captured through aerial remote sensing, are usually non-stationary in nature having time-varying feature statistics. Although, sophisticated reconstruction and prediction models exist for video anomaly detection, effective handling of non-stationarity has seldom been considered explicitly. In this paper, we propose to perform prediction using a time-recursive differencing network followed by autoregressive moving average estimation for video anomaly detection. The differencing network is employed to effectively handle non-stationarity in video data during the anomaly detection. Focusing on the prediction process, the effectiveness of the proposed approach is demonstrated considering a simple optical flow based video feature, and by generating qualitative and quantitative results on three aerial video datasets and two standard anomaly detection video datasets. EER, AUC and ROC curve based comparison with several existing methods including the state-of-the-art reveal the superiority of the proposed approach.