Abstract:Robust travel time predictions are of prime importance in managing any transportation infrastructure, and particularly in rail networks where they have major impacts both on traffic regulation and passenger satisfaction. We aim at predicting the travel time of trains on rail sections at the scale of an entire rail network in real-time, by estimating trains' delays relative to a theoretical circulation plan. Existing implementations within railway companies generally work using the approximation that a train's delay will stay constant for the rest of its trip. Predicting the evolution of a given train's delay is a uniquely hard problem, distinct from mainstream road traffic forecasting problems, since it involves several hard-to-model phenomena: train spacing, station congestion and heterogeneous rolling stock among others. We first offer empirical evidence of the previously unexplored phenomenon of delay propagation in the French National Railway Network, leading to delays being amplified by interactions between trains. We then contribute a novel technique using the transformer architecture and pre-trained embeddings to make real-time massively parallel predictions for train delays at the scale of the whole rail network (over 3k trains at peak hours, making predictions at an average horizon of 70 minutes). Our approach yields very positive results on real-world data when compared to currently-used and experimental prediction techniques. Our work is in the early stages of implementation for industrial use at the French railway company SNCF for passenger information systems, and a contender as a tool to aid traffic regulation decisions.