Abstract:Traffic flow forecasting is a crucial task in intelligent transport systems. Deep learning offers an effective solution, capturing complex patterns in time-series traffic flow data to enable the accurate prediction. However, deep learning models are prone to overfitting the intricate details of flow data, leading to poor generalisation. Recent studies suggest that decomposition-based deep ensemble learning methods may address this issue by breaking down a time series into multiple simpler signals, upon which deep learning models are built and ensembled to generate the final prediction. However, few studies have compared the performance of decomposition-based ensemble methods with non-decomposition-based ones which directly utilise raw time-series data. This work compares several decomposition-based and non-decomposition-based deep ensemble learning methods. Experimental results on three traffic datasets demonstrate the superiority of decomposition-based ensemble methods, while also revealing their sensitivity to aggregation strategies and forecasting horizons.
Abstract:Personalized Federated Learning (PFL) is widely employed in IoT applications to handle high-volume, non-iid client data while ensuring data privacy. However, heterogeneous edge devices owned by clients may impose varying degrees of resource constraints, causing computation and communication bottlenecks for PFL. Federated Dropout has emerged as a popular strategy to address this challenge, wherein only a subset of the global model, i.e. a \textit{sub-model}, is trained on a client's device, thereby reducing computation and communication overheads. Nevertheless, the dropout-based model-pruning strategy may introduce bias, particularly towards non-iid local data. When biased sub-models absorb highly divergent parameters from other clients, performance degradation becomes inevitable. In response, we propose federated learning with stochastic parameter update (FedSPU). Unlike dropout that tailors the global model to small-size local sub-models, FedSPU maintains the full model architecture on each device but randomly freezes a certain percentage of neurons in the local model during training while updating the remaining neurons. This approach ensures that a portion of the local model remains personalized, thereby enhancing the model's robustness against biased parameters from other clients. Experimental results demonstrate that FedSPU outperforms federated dropout by 7.57\% on average in terms of accuracy. Furthermore, an introduced early stopping scheme leads to a significant reduction of the training time by \(24.8\%\sim70.4\%\) while maintaining high accuracy.
Abstract:In federated learning (FL), the significant communication overhead due to the slow convergence speed of training the global model poses a great challenge. Specifically, a large number of communication rounds are required to achieve the convergence in FL. One potential solution is to employ the Newton-based optimization method for training, known for its quadratic convergence rate. However, the existing Newton-based FL training methods suffer from either memory inefficiency or high computational costs for local clients or the server. To address this issue, we propose an FL with approximated global Hessian (FAGH) method to accelerate FL training. FAGH leverages the first moment of the approximated global Hessian and the first moment of the global gradient to train the global model. By harnessing the approximated global Hessian curvature, FAGH accelerates the convergence of global model training, leading to the reduced number of communication rounds and thus the shortened training time. Experimental results verify FAGH's effectiveness in decreasing the number of communication rounds and the time required to achieve the pre-specified objectives of the global model performance in terms of training and test losses as well as test accuracy. Notably, FAGH outperforms several state-of-the-art FL training methods.
Abstract:This paper introduces a new stochastic optimization method based on the regularized Fisher information matrix (FIM), named SOFIM, which can efficiently utilize the FIM to approximate the Hessian matrix for finding Newton's gradient update in large-scale stochastic optimization of machine learning models. It can be viewed as a variant of natural gradient descent (NGD), where the challenge of storing and calculating the full FIM is addressed through making use of the regularized FIM and directly finding the gradient update direction via Sherman-Morrison matrix inversion. Additionally, like the popular Adam method, SOFIM uses the first moment of the gradient to address the issue of non-stationary objectives across mini-batches due to heterogeneous data. The utilization of the regularized FIM and Sherman-Morrison matrix inversion leads to the improved convergence rate with the same space and time complexities as stochastic gradient descent (SGD) with momentum. The extensive experiments on training deep learning models on several benchmark image classification datasets demonstrate that the proposed SOFIM outperforms SGD with momentum and several state-of-the-art Newton optimization methods, such as Nystrom-SGD, L-BFGS, and AdaHessian, in term of the convergence speed for achieving the pre-specified objectives of training and test losses as well as test accuracy.
Abstract:Traffic incident detection plays a key role in intelligent transportation systems, which has gained great attention in transport engineering. In the past, traditional machine learning (ML) based detection methods achieved good performance under a centralised computing paradigm, where all data are transmitted to a central server for building ML models therein. Nowadays, deep neural networks based federated learning (FL) has become a mainstream detection approach to enable the model training in a decentralised manner while warranting local data governance. Such neural networks-centred techniques, however, have overshadowed the utility of well-established ML-based detection methods. In this work, we aim to explore the potential of potent conventional ML-based detection models in modern traffic scenarios featured by distributed data. We leverage an elegant but less explored distributed optimisation framework named Network Lasso, with guaranteed global convergence for convex problem formulations, integrate the potent convex ML model with it, and compare it with centralised learning, local learning, and federated learning methods atop a well-known traffic incident detection dataset. Experimental results show that the proposed network lasso-based approach provides a promising alternative to the FL-based approach in data-decentralised traffic scenarios, with a strong convergence guarantee while rekindling the significance of conventional ML-based detection methods.
Abstract:Medical image segmentation (MIS) plays an instrumental role in medical image analysis, where considerable efforts have been devoted to automating the process. Currently, mainstream MIS approaches are based on deep neural networks (DNNs) which are typically trained on a dataset that contains annotation masks produced by doctors. However, in the medical domain, the annotation masks generated by different doctors can inherently vary because a doctor may unnecessarily produce precise and unique annotations to meet the goal of diagnosis. Therefore, the DNN model trained on the data annotated by certain doctors, often just a single doctor, could undesirably favour those doctors who annotate the training data, leading to the unsatisfaction of a new doctor who will use the trained model. To address this issue, this work investigates the utilization of multi-expert annotation to enhance the adaptability of the model to a new doctor and we conduct a pilot study on the MRI brain segmentation task. Experimental results demonstrate that the model trained on a dataset with multi-expert annotation can efficiently cater for a new doctor, after lightweight fine-tuning on just a few annotations from the new doctor.
Abstract:Medical image segmentation has been significantly advanced by deep learning (DL) techniques, though the data scarcity inherent in medical applications poses a great challenge to DL-based segmentation methods. Self-supervised learning offers a solution by creating auxiliary learning tasks from the available dataset and then leveraging the knowledge acquired from solving auxiliary tasks to help better solve the target segmentation task. Different auxiliary tasks may have different properties and thus can help the target task to different extents. It is desired to leverage their complementary advantages to enhance the overall assistance to the target task. To achieve this, existing methods often adopt a joint training paradigm, which co-solves segmentation and auxiliary tasks by integrating their losses or intermediate gradients. However, direct coupling of losses or intermediate gradients risks undesirable interference because the knowledge acquired from solving each auxiliary task at every training step may not always benefit the target task. To address this issue, we propose a two-stage training approach. In the first stage, the target segmentation task will be independently co-solved with each auxiliary task in both joint training and pre-training modes, with the better model selected via validation performance. In the second stage, the models obtained with respect to each auxiliary task are converted into a single model using an ensemble knowledge distillation method. Our approach allows for making best use of each auxiliary task to create multiple elite segmentation models and then combine them into an even more powerful model. We employed five auxiliary tasks of different proprieties in our approach and applied it to train the U-Net model on an X-ray pneumothorax segmentation dataset. Experimental results demonstrate the superiority of our approach over several existing methods.
Abstract:The precision of unsupervised point cloud registration methods is typically limited by the lack of reliable inlier estimation and self-supervised signal, especially in partially overlapping scenarios. In this paper, we propose an effective inlier estimation method for unsupervised point cloud registration by capturing geometric structure consistency between the source point cloud and its corresponding reference point cloud copy. Specifically, to obtain a high quality reference point cloud copy, an One-Nearest Neighborhood (1-NN) point cloud is generated by input point cloud. This facilitates matching map construction and allows for integrating dual neighborhood matching scores of 1-NN point cloud and input point cloud to improve matching confidence. Benefiting from the high quality reference copy, we argue that the neighborhood graph formed by inlier and its neighborhood should have consistency between source point cloud and its corresponding reference copy. Based on this observation, we construct transformation-invariant geometric structure representations and capture geometric structure consistency to score the inlier confidence for estimated correspondences between source point cloud and its reference copy. This strategy can simultaneously provide the reliable self-supervised signal for model optimization. Finally, we further calculate transformation estimation by the weighted SVD algorithm with the estimated correspondences and corresponding inlier confidence. We train the proposed model in an unsupervised manner, and extensive experiments on synthetic and real-world datasets illustrate the effectiveness of the proposed method.
Abstract:Physics-informed neural networks (PINNs) are a newly emerging research frontier in machine learning, which incorporate certain physical laws that govern a given data set, e.g., those described by partial differential equations (PDEs), into the training of the neural network (NN) based on such a data set. In PINNs, the NN acts as the solution approximator for the PDE while the PDE acts as the prior knowledge to guide the NN training, leading to the desired generalization performance of the NN when facing the limited availability of training data. However, training PINNs is a non-trivial task largely due to the complexity of the loss composed of both NN and physical law parts. In this work, we propose a new PINN training framework based on the multi-task optimization (MTO) paradigm. Under this framework, multiple auxiliary tasks are created and solved together with the given (main) task, where the useful knowledge from solving one task is transferred in an adaptive mode to assist in solving some other tasks, aiming to uplift the performance of solving the main task. We implement the proposed framework and apply it to train the PINN for addressing the traffic density prediction problem. Experimental results demonstrate that our proposed training framework leads to significant performance improvement in comparison to the traditional way of training the PINN.
Abstract:Traffic prediction is a flourishing research field due to its importance in human mobility in the urban space. Despite this, existing studies only focus on short-term prediction of up to few hours in advance, with most being up to one hour only. Long-term traffic prediction can enable more comprehensive, informed, and proactive measures against traffic congestion and is therefore an important task to explore. In this paper, we explore the task of long-term traffic prediction; where we predict traffic up to 24 hours in advance. We note the weaknesses of existing models--which are based on recurrent structures--for long-term traffic prediction and propose a modified Transformer model "TrafFormer". Experiments comparing our model with existing hybrid neural network models show the superiority of our model.