The Spatio-Temporal Crowd Flow Prediction (STCFP) problem is a classical problem with plenty of prior research efforts that benefit from traditional statistical learning and recent deep learning approaches. While STCFP can refer to many real-world problems, most existing studies focus on quite specific applications, such as the prediction of taxi demand, ridesharing order, and so on. This hinders the STCFP research as the approaches designed for different applications are hardly comparable, and thus how an applicationdriven approach can be generalized to other scenarios is unclear. To fill in this gap, this paper makes two efforts: (i) we propose an analytic framework, called STAnalytic, to qualitatively investigate STCFP approaches regarding their design considerations on various spatial and temporal factors, aiming to make different application-driven approaches comparable; (ii) we construct an extensively large-scale STCFP benchmark datasets with four different scenarios (including ridesharing, bikesharing, metro, and electrical vehicle charging) with up to hundreds of millions of flow records, to quantitatively measure the generalizability of STCFP approaches. Furthermore, to elaborate the effectiveness of STAnalytic in helping design generalizable STCFP approaches, we propose a spatio-temporal meta-model, called STMeta, by integrating generalizable temporal and spatial knowledge identified by STAnalytic. We implement three variants of STMeta with different deep learning techniques. With the datasets, we demonstrate that STMeta variants can outperform state-of-the-art STCFP approaches by 5%.