Training models on spatio-temporal (ST) data poses an open problem due to the complicated and diverse nature of the data itself, and it is challenging to ensure the model's performance directly trained on the original ST data. While limiting the variety of training data can make training easier, it can also lead to a lack of knowledge and information for the model, resulting in a decrease in performance. To address this challenge, we presented an innovative paradigm that incorporates three separate forms of curriculum learning specifically targeting from spatial, temporal, and quantile perspectives. Furthermore, our framework incorporates a stacking fusion module to combine diverse information from three types of curriculum learning, resulting in a strong and thorough learning process. We demonstrated the effectiveness of this framework with extensive empirical evaluations, highlighting its better performance in addressing complex ST challenges. We provided thorough ablation studies to investigate the effectiveness of our curriculum and to explain how it contributes to the improvement of learning efficiency on ST data.