Abstract: Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Because the training objectives of the modules are inconsistent, the integrated system suffers from bias. End-to-end attempts to address this issue by using a single model that maps directly from sensor data to control signals. This path has limited capability to learn a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In light of the challenges faced by both paths, many researchers believe that large language models (LLMs), with their powerful reasoning abilities and extensive knowledge, could offer a solution by providing AD systems with deeper levels of understanding and decision-making capabilities. To understand whether LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We further analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.
Abstract: Large-scale missing data is a challenging problem in Intelligent Transportation Systems (ITS). Many studies have been carried out to impute large-scale traffic data by considering their spatiotemporal correlations at a network level. Existing traffic data imputation methods, however, have largely ignored the rich semantic information of a road network when capturing network-wide spatiotemporal correlations. This study proposes a Graph Transformer for Traffic Data Imputation (GT-TDI) model to impute large-scale traffic data with spatiotemporal semantic understanding of a road network. Specifically, the proposed model introduces semantic descriptions consisting of network-wide spatial and temporal information of traffic data to help the GT-TDI model capture spatiotemporal correlations at a network level. The proposed model takes incomplete data, the social connectivity of sensors, and semantic descriptions as input to perform imputation tasks with the help of a Graph Neural Network (GNN) and a Transformer. Extensive experiments are conducted on the PeMS freeway dataset to compare the proposed GT-TDI model with conventional methods, tensor factorization methods, and deep learning-based methods. The results show that the proposed GT-TDI outperforms existing methods under complex missing patterns and diverse missing rates. The code of the GT-TDI model will be available at https://github.com/KP-Zhang/GT-TDI.
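The evaluation setup described above (hide part of the data, impute it, score only the hidden entries) can be sketched as follows. This is a minimal illustration, not the GT-TDI model: the per-sensor mean imputer stands in for a simple conventional baseline, and the function names, the row-per-sensor layout, and the use of `None` for missing values are all assumptions made for this example.

```python
import random


def mask_random(data, rate, seed=0):
    """Copy `data` (rows = sensors, columns = time steps), hiding entries at random.

    Each entry is independently replaced by None with probability `rate`.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in data]


def impute_sensor_mean(masked):
    """Fill each missing entry with the mean of that sensor's observed values.

    Hypothetical baseline; a sensor with no observations is filled with 0.0.
    """
    filled = []
    for row in masked:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def mae_on_missing(truth, masked, filled):
    """Mean absolute error computed only over the entries that were hidden."""
    errors = [
        abs(t - f)
        for t_row, m_row, f_row in zip(truth, masked, filled)
        for t, m, f in zip(t_row, m_row, f_row)
        if m is None
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    truth = [[60.0, 62.0, 58.0, 61.0], [30.0, 35.0, 32.0, 31.0]]
    masked = mask_random(truth, rate=0.3, seed=42)
    filled = impute_sensor_mean(masked)
    print(mae_on_missing(truth, masked, filled))
```

Scoring only the hidden entries keeps the error from being diluted by values the imputer never had to estimate. Structured missing patterns (for example, a sensor dropping out for a contiguous block of time) would need a different masking function than the purely random one shown here.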
Abstract: A time-space traffic (TS) diagram, which presents traffic states in time-space cells with colors, is one of the most important traffic analysis and visualization tools. Despite its importance for transportation research and engineering, most TS diagrams that already exist or are being produced are too coarse to exhibit detailed traffic dynamics, owing to the limitations of current information technology and traffic infrastructure investment. To increase the resolution of a TS diagram and make it present more traffic details, this paper introduces a TS diagram refinement problem and proposes a multiple linear regression-based model to solve it. Two tests, which attempt to increase the resolution of a TS diagram by factors of 4 and 16, respectively, are carried out to evaluate the performance of the proposed model. Data collected at different times, in different locations, and even in different countries are used to thoroughly evaluate the accuracy and transferability of the proposed model. These strict tests with diverse data show that the proposed model, although simple in form, is able to refine a TS diagram with promising accuracy and reliable transferability. The proposed refinement model will "save" those widely existing TS diagrams from their blurry "faces" and make it possible to learn more traffic details from them.
Abstract: This study proposes a novel Graph Convolutional Neural Network with Data-driven Graph Filter (GCNN-DDGF) model that can learn hidden heterogeneous pairwise correlations between stations to predict station-level hourly demand in a large-scale bike-sharing network. Two architectures of the GCNN-DDGF model are explored: GCNNreg-DDGF is a regular GCNN-DDGF model that contains the convolution and feedforward blocks, and GCNNrec-DDGF additionally contains a recurrent block from the Long Short-term Memory neural network architecture to capture temporal dependencies in the bike-sharing demand series. Furthermore, four types of GCNN models are proposed whose adjacency matrices are based on various bike-sharing system data, including the Spatial Distance matrix (SD), Demand matrix (DE), Average Trip Duration matrix (ATD), and Demand Correlation matrix (DC). These six types of GCNN models and seven other benchmark models are built and compared on a Citi Bike dataset from New York City, which includes 272 stations and over 28 million transactions from 2013 to 2016. Results show that the GCNNrec-DDGF performs the best in terms of the Root Mean Square Error, the Mean Absolute Error, and the coefficient of determination (R²), followed by the GCNNreg-DDGF; both outperform the other models. Through a more detailed graph network analysis based on the learned DDGF, insights are obtained into the black box of the GCNN-DDGF model. The learned DDGF is found to capture some information similar to details embedded in the SD, DE, and DC matrices. More importantly, it also uncovers hidden heterogeneous pairwise correlations between stations that are not revealed by any of those matrices.
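For reference, the three evaluation metrics named above can be computed as in the following sketch. It uses the standard textbook definitions of RMSE, MAE, and R²; the function names and the sample demand values are made up for illustration and do not come from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all observed values are identical.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical hourly demand at one station: observed vs. predicted.
    observed = [10, 20, 30, 40]
    predicted = [12, 18, 33, 37]
    print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

Lower RMSE and MAE indicate better predictions, while a higher R² (closer to 1) indicates that more of the variance in observed demand is explained, so a model that "performs the best" on all three is minimizing the first two and maximizing the third.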
Abstract: This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with high accuracy. Spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix. A CNN is applied to the image in two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction. The effectiveness of the proposed method is evaluated on two real-world transportation networks, the second ring road and the north-east transportation network in Beijing, by comparing the method with four prevailing algorithms, namely, ordinary least squares, k-nearest neighbors, artificial neural network, and random forest, and three deep learning architectures, namely, stacked autoencoder, recurrent neural network, and long short-term memory network. The results show that the proposed method outperforms the other algorithms by an average accuracy improvement of 42.91% within an acceptable execution time. The CNN model can be trained in a reasonable time and, thus, is suitable for large-scale transportation networks.
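The conversion of traffic dynamics into a two-dimensional time-space matrix can be illustrated with a small sketch. The details here are assumptions for illustration only and are not taken from the paper: rows are road segments, columns are time intervals, each cell holds the average observed speed scaled to [0, 1] by a hypothetical `max_speed`, and cells with no observations are set to 0.0.

```python
def build_time_space_matrix(records, n_segments, n_intervals, max_speed):
    """Build a single-channel "image" from speed observations.

    records: iterable of (segment_index, interval_index, speed) tuples.
    Returns a list of n_segments rows, each with n_intervals values in [0, 1].
    """
    sums = [[0.0] * n_intervals for _ in range(n_segments)]
    counts = [[0] * n_intervals for _ in range(n_segments)]

    # Accumulate speeds per (segment, interval) cell.
    for segment, interval, speed in records:
        sums[segment][interval] += speed
        counts[segment][interval] += 1

    image = []
    for s in range(n_segments):
        row = []
        for t in range(n_intervals):
            if counts[s][t] == 0:
                row.append(0.0)  # assumption: empty cells are filled with 0.0
            else:
                mean_speed = sums[s][t] / counts[s][t]
                # Scale to [0, 1] and clamp speeds above max_speed.
                row.append(min(1.0, mean_speed / max_speed))
        image.append(row)
    return image


if __name__ == "__main__":
    # Hypothetical observations: (segment, interval, speed in km/h).
    records = [(0, 0, 60.0), (0, 0, 40.0), (0, 1, 30.0), (1, 1, 120.0)]
    for row in build_time_space_matrix(records, n_segments=2, n_intervals=2, max_speed=100.0):
        print(row)
```

A matrix of this form can be passed to a CNN as a one-channel image, so that convolution filters slide over neighboring segments and neighboring time intervals at once.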