Spatiotemporal prediction over graphs (STPG) is crucial for transportation systems. In existing STPG models, an adjacency matrix is an important component that captures the relations among nodes over graphs. However, most studies calculate the adjacency matrix by directly memorizing the data, such as distance- and correlation-based matrices. These adjacency matrices do not consider potential pattern shift for the test data, and may result in suboptimal performance if the test data has a different distribution from the training one. This issue is known as the Out-of-Distribution generalization problem. To address this issue, in this paper we propose a Causal Adjacency Learning (CAL) method to discover causal relations over graphs. The learned causal adjacency matrix is evaluated on a downstream spatiotemporal prediction task using real-world graph data. Results demonstrate that our proposed adjacency matrix can capture the causal relations, and using our learned adjacency matrix can enhance prediction performance on the OOD test data, even though causal learning is not conducted in the downstream task.