https://github.com/beginner-sketch/MoSSL.
Multi-modality spatio-temporal (MoST) data extends spatio-temporal (ST) data by incorporating multiple modalities, which is prevalent in monitoring systems, encompassing diverse traffic demands and air quality assessments. Despite significant strides in ST modeling in recent years, there remains a need to emphasize harnessing the potential of information from different modalities. Robust MoST forecasting is more challenging because it possesses (i) high-dimensional and complex internal structures and (ii) dynamic heterogeneity caused by temporal, spatial, and modality variations. In this study, we propose a novel MoST learning framework via Self-Supervised Learning, namely MoSSL, which aims to uncover latent patterns from temporal, spatial, and modality perspectives while quantifying dynamic heterogeneity. Experiment results on two real-world MoST datasets verify the superiority of our approach compared with the state-of-the-art baselines. Model implementation is available at