Abstract:Urban prediction tasks, such as forecasting traffic flow, temperature, and crime rates, are crucial for efficient urban planning and management. However, existing Spatiotemporal Graph Neural Networks (ST-GNNs) often rely solely on accuracy, overlooking spatial and demographic disparities in their predictions. This oversight can lead to imbalanced resource allocation and exacerbate existing inequities in urban areas. This study introduces a Residual-Aware Attention (RAA) Block and an equality-enhancing loss function to address these disparities. By adapting the adjacency matrix during training and incorporating spatial disparity metrics, our approach aims to reduce local segregation of residuals and errors. We applied our methodology to urban prediction tasks in Chicago, utilizing a travel demand dataset as an example. Our model achieved a 48% significant improvement in fairness metrics with only a 9% increase in error metrics. Spatial analysis of residual distributions revealed that models with RAA Blocks produced more equitable prediction results, particularly by reducing errors clustered in central regions. Attention maps demonstrated the model's ability to dynamically adjust focus, leading to more balanced predictions. Case studies of various community areas in Chicago further illustrated the effectiveness of our approach in addressing spatial and demographic disparities, supporting more balanced and equitable urban planning and policy-making.
Abstract:Effective traffic prediction is a cornerstone of intelligent transportation systems, enabling precise forecasts of traffic flow, speed, and congestion. While traditional spatio-temporal graph neural networks (ST-GNNs) have achieved notable success in short-term traffic forecasting, their performance in long-term predictions remains limited. This challenge arises from over-squashing problem, where bottlenecks and limited receptive fields restrict information flow and hinder the modeling of global dependencies. To address these challenges, this study introduces a novel framework that incorporates virtual nodes, which are additional nodes added to the graph and connected to existing nodes, in order to aggregate information across the entire graph within a single GNN layer. Our proposed model incorporates virtual nodes by constructing a semi-adaptive adjacency matrix. This matrix integrates distance-based and adaptive adjacency matrices, allowing the model to leverage geographical information while also learning task-specific features from data. Experimental results demonstrate that the inclusion of virtual nodes significantly enhances long-term prediction accuracy while also improving layer-wise sensitivity to mitigate the over-squashing problem. Virtual nodes also offer enhanced explainability by focusing on key intersections and high-traffic areas, as shown by the visualization of their adjacency matrix weights on road network heat maps. Our advanced approach enhances the understanding and management of urban traffic systems, making it particularly well-suited for real-world applications.
Abstract:Vision language models (VLMs) have demonstrated impressive performance across a wide range of downstream tasks. However, their proficiency in spatial reasoning remains limited, despite its crucial role in tasks involving navigation and interaction with physical environments. Specifically, much of the spatial reasoning in these tasks occurs in two-dimensional (2D) environments, and our evaluation reveals that state-of-the-art VLMs frequently generate implausible and incorrect responses to composite spatial reasoning problems, including simple pathfinding tasks that humans can solve effortlessly at a glance. To address this, we explore an effective approach to enhance 2D spatial reasoning within VLMs by training the model on basic spatial capabilities. We begin by disentangling the key components of 2D spatial reasoning: direction comprehension, distance estimation, and localization. Our central hypothesis is that mastering these basic spatial capabilities can significantly enhance a model's performance on composite spatial tasks requiring advanced spatial understanding and combinatorial problem-solving. To investigate this hypothesis, we introduce Sparkle, a framework that fine-tunes VLMs on these three basic spatial capabilities by synthetic data generation and targeted supervision to form an instruction dataset for each capability. Our experiments demonstrate that VLMs fine-tuned with Sparkle achieve significant performance gains, not only in the basic tasks themselves but also in generalizing to composite and out-of-distribution spatial reasoning tasks (e.g., improving from 13.5% to 40.0% on the shortest path problem). These findings underscore the effectiveness of mastering basic spatial capabilities in enhancing composite spatial problem-solving, offering insights for improving VLMs' spatial reasoning capabilities.
Abstract:Graph Neural Networks deliver strong classification results but often suffer from poor calibration performance, leading to overconfidence or underconfidence. This is particularly problematic in high stakes applications where accurate uncertainty estimates are essential. Existing post hoc methods, such as temperature scaling, fail to effectively utilize graph structures, while current GNN calibration methods often overlook the potential of leveraging diverse input information and model ensembles jointly. In the paper, we propose Graph Ensemble Temperature Scaling, a novel calibration framework that combines input and model ensemble strategies within a Graph Mixture of Experts archi SOTA calibration techniques, reducing expected calibration error by 25 percent across 10 GNN benchmark datasets. Additionally, GETS is computationally efficient, scalable, and capable of selecting effective input combinations for improved calibration performance.
Abstract:Quantifying uncertainty is crucial for robust and reliable predictions. However, existing spatiotemporal deep learning mostly focuses on deterministic prediction, overlooking the inherent uncertainty in such prediction. Particularly, highly-granular spatiotemporal datasets are often sparse, posing extra challenges in prediction and uncertainty quantification. To address these issues, this paper introduces a novel post-hoc Sparsity-awar Uncertainty Calibration (SAUC) framework, which calibrates uncertainty in both zero and non-zero values. To develop SAUC, we firstly modify the state-of-the-art deterministic spatiotemporal Graph Neural Networks (ST-GNNs) to probabilistic ones in the pre-calibration phase. Then we calibrate the probabilistic ST-GNNs for zero and non-zero values using quantile approaches.Through extensive experiments, we demonstrate that SAUC can effectively fit the variance of sparse data and generalize across two real-world spatiotemporal datasets at various granularities. Specifically, our empirical experiments show a 20\% reduction in calibration errors in zero entries on the sparse traffic accident and urban crime prediction. Overall, this work demonstrates the theoretical and empirical values of the SAUC framework, thus bridging a significant gap between uncertainty quantification and spatiotemporal prediction.
Abstract:Transportation mode share analysis is important to various real-world transportation tasks as it helps researchers understand the travel behaviors and choices of passengers. A typical example is the prediction of communities' travel mode share by accounting for their sociodemographics like age, income, etc., and travel modes' attributes (e.g. travel cost and time). However, there exist only limited efforts in integrating the structure of the urban built environment, e.g., road networks, into the mode share models to capture the impacts of the built environment. This task usually requires manual feature engineering or prior knowledge of the urban design features. In this study, we propose deep hybrid models (DHM), which directly combine road networks and sociodemographic features as inputs for travel mode share analysis. Using graph embedding (GE) techniques, we enhance travel demand models with a more powerful representation of urban structures. In experiments of mode share prediction in Chicago, results demonstrate that DHM can provide valuable spatial insights into the sociodemographic structure, improving the performance of travel demand models in estimating different mode shares at the city level. Specifically, DHM improves the results by more than 20\% while retaining the interpretation power of the choice models, demonstrating its superiority in interpretability, prediction accuracy, and geographical insights.
Abstract:Predicting traffic incident risks at granular spatiotemporal levels is challenging. The datasets predominantly feature zero values, indicating no incidents, with sporadic high-risk values for severe incidents. Notably, a majority of current models, especially deep learning methods, focus solely on estimating risk values, overlooking the uncertainties arising from the inherently unpredictable nature of incidents. To tackle this challenge, we introduce the Spatiotemporal Zero-Inflated Tweedie Graph Neural Networks (STZITD-GNNs). Our model merges the reliability of traditional statistical models with the flexibility of graph neural networks, aiming to precisely quantify uncertainties associated with road-level traffic incident risks. This model strategically employs a compound model from the Tweedie family, as a Poisson distribution to model risk frequency and a Gamma distribution to account for incident severity. Furthermore, a zero-inflated component helps to identify the non-incident risk scenarios. As a result, the STZITD-GNNs effectively capture the dataset's skewed distribution, placing emphasis on infrequent but impactful severe incidents. Empirical tests using real-world traffic data from London, UK, demonstrate that our model excels beyond current benchmarks. The forte of STZITD-GNN resides not only in its accuracy but also in its adeptness at curtailing uncertainties, delivering robust predictions over short (7 days) and extended (14 days) timeframes.
Abstract:Short-term demand forecasting for on-demand ride-hailing services is one of the fundamental issues in intelligent transportation systems. However, previous travel demand forecasting research predominantly focused on improving prediction accuracy, ignoring fairness issues such as systematic underestimations of travel demand in disadvantaged neighborhoods. This study investigates how to measure, evaluate, and enhance prediction fairness between disadvantaged and privileged communities in spatial-temporal demand forecasting of ride-hailing services. A two-pronged approach is taken to reduce the demand prediction bias. First, we develop a novel deep learning model architecture, named socially aware neural network (SA-Net), to integrate the socio-demographics and ridership information for fair demand prediction through an innovative socially-aware convolution operation. Second, we propose a bias-mitigation regularization method to mitigate the mean percentage prediction error gap between different groups. The experimental results, validated on the real-world Chicago Transportation Network Company (TNC) data, show that the de-biasing SA-Net can achieve better predictive performance in both prediction accuracy and fairness. Specifically, the SA-Net improves prediction accuracy for both the disadvantaged and privileged groups compared with the state-of-the-art models. When coupled with the bias mitigation regularization method, the de-biasing SA-Net effectively bridges the mean percentage prediction error gap between the disadvantaged and privileged groups, and also protects the disadvantaged regions against systematic underestimation of TNC demand. Our proposed de-biasing method can be adopted in many existing short-term travel demand estimation models, and can be utilized for various other spatial-temporal prediction tasks such as crime incidents predictions.
Abstract:Classical demand modeling analyzes travel behavior using only low-dimensional numeric data (i.e. sociodemographics and travel attributes) but not high-dimensional urban imagery. However, travel behavior depends on the factors represented by both numeric data and urban imagery, thus necessitating a synergetic framework to combine them. This study creates a theoretical framework of deep hybrid models with a crossing structure consisting of a mixing operator and a behavioral predictor, thus integrating the numeric and imagery data into a latent space. Empirically, this framework is applied to analyze travel mode choice using the MyDailyTravel Survey from Chicago as the numeric inputs and the satellite images as the imagery inputs. We found that deep hybrid models outperform both the traditional demand models and the recent deep learning in predicting the aggregate and disaggregate travel behavior with our supervision-as-mixing design. The latent space in deep hybrid models can be interpreted, because it reveals meaningful spatial and social patterns. The deep hybrid models can also generate new urban images that do not exist in reality and interpret them with economic theory, such as computing substitution patterns and social welfare changes. Overall, the deep hybrid models demonstrate the complementarity between the low-dimensional numeric and high-dimensional imagery data and between the traditional demand modeling and recent deep learning. It generalizes the latent classes and variables in classical hybrid demand models to a latent space, and leverages the computational power of deep learning for imagery while retaining the economic interpretability on the microeconomics foundation.
Abstract:Recent studies have significantly improved the prediction accuracy of travel demand using graph neural networks. However, these studies largely ignored uncertainty that inevitably exists in travel demand prediction. To fill this gap, this study proposes a framework of probabilistic graph neural networks (Prob-GNN) to quantify the spatiotemporal uncertainty of travel demand. This Prob-GNN framework is substantiated by deterministic and probabilistic assumptions, and empirically applied to the task of predicting the transit and ridesharing demand in Chicago. We found that the probabilistic assumptions (e.g. distribution tail, support) have a greater impact on uncertainty prediction than the deterministic ones (e.g. deep modules, depth). Among the family of Prob-GNNs, the GNNs with truncated Gaussian and Laplace distributions achieve the highest performance in transit and ridesharing data. Even under significant domain shifts, Prob-GNNs can predict the ridership uncertainty in a stable manner, when the models are trained on pre-COVID data and tested across multiple periods during and after the COVID-19 pandemic. Prob-GNNs also reveal the spatiotemporal pattern of uncertainty, which is concentrated on the afternoon peak hours and the areas with large travel volumes. Overall, our findings highlight the importance of incorporating randomness into deep learning for spatiotemporal ridership prediction. Future research should continue to investigate versatile probabilistic assumptions to capture behavioral randomness, and further develop methods to quantify uncertainty to build resilient cities.