Abstract:Accurately predicting the destination of taxi trajectories can have various benefits for intelligent location-based services. One potential method to accomplish this prediction is by converting the taxi trajectory into a two-dimensional grid and using computer vision techniques. While the Swin Transformer is an innovative computer vision architecture with demonstrated success in vision downstream tasks, it is not commonly used to solve real-world trajectory problems. In this paper, we propose a simplified Swin Transformer (SST) structure that does not use the shifted window idea in the traditional Swin Transformer, as trajectory data is consecutive in nature. Our comprehensive experiments, based on real trajectory data, demonstrate that SST can achieve higher accuracy compared to state-of-the-art methods.