Abstract:When considering the spatial and temporal features of traffic, capturing the impacts of various external factors on travel is an important step towards achieving accurate traffic forecasting. The impacts of external factors on the traffic flow have complex correlations. However, existing studies seldom consider external factors or neglecting the effect of the complex correlations among external factors on traffic. Intuitively, knowledge graphs can naturally describe these correlations, but knowledge graphs and traffic networks are essentially heterogeneous networks; thus, it is a challenging problem to integrate the information in both networks. We propose a knowledge representation-driven traffic forecasting method based on spatiotemporal graph convolutional networks. We first construct a city knowledge graph for traffic forecasting, then use KS-Cells to combine the information from the knowledge graph and the traffic network, and finally, capture the temporal changes of the traffic state with GRU. Testing on real-world datasets shows that the KST-GCN has higher accuracy than the baseline traffic forecasting methods at various prediction horizons. We provide a new way to integrate knowledge and the spatiotemporal features of data for traffic forecasting tasks. Without any loss of generality, the proposed method can also be extended to other spatiotemporal forecasting tasks.
Abstract:Agglomerative hierarchical clustering (AHC) is one of the popular clustering approaches. Existing AHC methods, which are based on a distance measure, have one key issue: it has difficulty in identifying adjacent clusters with varied densities, regardless of the cluster extraction methods applied on the resultant dendrogram. In this paper, we identify the root cause of this issue and show that the use of a data-dependent kernel (instead of distance or existing kernel) provides an effective means to address it. We analyse the condition under which existing AHC methods fail to extract clusters effectively; and the reason why the data-dependent kernel is an effective remedy. This leads to a new approach to kernerlise existing hierarchical clustering algorithms such as existing traditional AHC algorithms, HDBSCAN, GDL and PHA. In each of these algorithms, our empirical evaluation shows that a recently introduced Isolation Kernel produces a higher quality or purer dendrogram than distance, Gaussian Kernel and adaptive Gaussian Kernel.