Abstract: With the growing adoption of mobility as a service (MaaS), it is increasingly important to manage multiple traffic modes simultaneously and cooperatively. Short-term passenger flow prediction across multiple traffic modes, an important component of MaaS, has therefore come into focus. The problem is challenging because the spatiotemporal features of multiple traffic modes are highly complex. To address it, this paper proposes a multi-task learning-based model, called Res-Transformer, for short-term passenger flow prediction of multiple traffic modes (subway, taxi, and bus). Each traffic mode is treated as a single task in the model. The Res-Transformer consists of two parts: (1) several modified transformer layers comprising 2D convolutional neural networks (CNNs) and a multi-head attention mechanism, which help to extract the spatial and temporal features of the traffic modes, and (2) a residual network architecture used to extract the inner patterns of the different traffic modes and enhance their passenger flow features. The Res-Transformer model is evaluated on two large-scale real-world datasets from Beijing, China: one covers the region of a traffic hub and the other the region of a residential area. Experiments compare the proposed model with several state-of-the-art models to demonstrate its effectiveness and robustness. This paper offers insights into short-term passenger flow prediction for multiple traffic modes.