Abstract:Public transport systems are expected to reduce pollution and contribute to sustainable development. However, disruptions in public transport such as delays may negatively affect mobility choices. To quantify delays, aggregated data from vehicle locations systems are frequently used. However, delays observed at individual stops are caused inter alia by fluctuations in running times and propagation of delays occurring in other locations. Hence, in this work, we propose both the method detecting significant delays and reference architecture, relying on stream processing engines, in which the method is implemented. The method can complement the calculation of delays defined as deviation from schedules. This provides both online rather than batch identification of significant and repetitive delays, and resilience to the limited quality of location data. The method we propose can be used with different change detectors, such as ADWIN, applied to location data stream shuffled to individual edges of a transport graph. It can detect in an online manner at which edges statistically significant delays are observed and at which edges delays arise and are reduced. Detections can be used to model mobility choices and quantify the impact of repetitive rather than random disruptions on feasible trips with multimodal trip modelling engines. The evaluation performed with the public transport data of over 2000 vehicles confirms the merits of the method and reveals that a limited-size subgraph of a transport system graph causes statistically significant delays