Traffic flow forecasting (TFF) is of great importance to the construction of Intelligent Transportation Systems (ITS). To mitigate communication burden and tackle with the problem of privacy leakage aroused by centralized forecasting methods, Federated Learning (FL) has been applied to TFF. However, existing FL-based approaches employ batch learning manner, which makes the pre-trained models inapplicable to subsequent traffic data, thus exhibiting subpar prediction performance. In this paper, we perform the first study of forecasting traffic flow adopting Online Learning (OL) manner in FL framework and then propose a novel prediction method named Online Spatio-Temporal Correlation-based Federated Learning (FedOSTC), aiming to guarantee performance gains regardless of traffic fluctuation. Specifically, clients employ Gated Recurrent Unit (GRU)-based encoders to obtain the internal temporal patterns inside traffic data sequences. Then, the central server evaluates spatial correlation among clients via Graph Attention Network (GAT), catering to the dynamic changes of spatial closeness caused by traffic fluctuation. Furthermore, to improve the generalization of the global model for upcoming traffic data, a period-aware aggregation mechanism is proposed to aggregate the local models which are optimized using Online Gradient Descent (OGD) algorithm at clients. We perform comprehensive experiments on two real-world datasets to validate the efficiency and effectiveness of our proposed method and the numerical results demonstrate the superiority of FedOSTC.