Abstract:In large-scale traffic optimization, models based on Macroscopic Fundamental Diagram (MFD) are recognized for their efficiency in broad analyses. However, they fail to reflect variations in the individual traffic status of each road link, leading to a gap in detailed traffic optimization and analysis. To address the limitation, this study introduces a Local Correction Factor (LCF) that a function integrates MFD-derived network mean speed with network configurations to accurately estimate the individual speed of the link. We use a novel deep learning framework combining Graph Attention Networks (GATs) with Gated Recurrent Units (GRUs) to capture both spatial configurations and temporal dynamics of the network. Coupled with a strategic network partitioning method, our model enhances the precision of link-level traffic speed estimations while preserving the computational benefits of aggregate models. In the experiment, we evaluate the proposed LCF through various urban traffic scenarios, including different demand levels, origin-destination distributions, and road configurations. The results show the robust adaptability and effectiveness of the proposed model. Furthermore, we validate the practicality of our model by calculating the travel time of each randomly generated path, with the average error relative to MFD-based results being reduced to approximately 76%.
Abstract:Perimeter Control (PC) strategies have been proposed to address urban road network control in oversaturated situations by monitoring transfer flows of the Protected Network (PN). The uniform metering rate for cordon signals in existing studies ignores the variety of local traffic states at the intersection level, which may cause severe local traffic congestion and ruin the network stability. This paper introduces a semi-model dependent Multi-Agent Reinforcement Learning (MARL) framework to conduct PC with heterogeneous cordon signal behaviors. The proposed strategy integrates the MARL-based signal control method with centralized feedback PC policy and is applied to cordon signals of the PN. It operates as a two-stage system, with the feedback PC strategy detecting the overall traffic state within the PN and then distributing local instructions to cordon signals controlled by agents in the MARL framework. Each cordon signal acts independently and differently, creating a slack and distributed PC for the PN. The combination of the model-free and model-based methods is achieved by reconstructing the action-value function of the local agents with PC feedback reward without violating the integrity of the local signal control policy learned from the RL training process. Through numerical tests with different demand patterns in a microscopic traffic environment, the proposed PC strategy (a) is shown robustness, scalability, and transferability, (b) outperforms state-of-the-art model-based PC strategies in increasing network throughput, reducing cordon queue and carbon emission.