As a promising distributed machine learning paradigm, Federated Learning (FL) enables the participating devices to collaboratively train a global model without exposing their local data. However, in non-IID scenarios, the classification accuracy of FL models drops drastically because of the weight divergence caused by data heterogeneity. Although various FL variants have been proposed to improve model accuracy, most of them still incur non-negligible communication and computation overhead. In this paper, we introduce a novel FL approach named FedCat that achieves high model accuracy through our proposed device selection strategy and device concatenation-based local training method. Unlike conventional FL methods, which aggregate local models trained on individual devices, FedCat periodically aggregates local models after they have traversed a series of logically concatenated devices, which effectively alleviates the model weight divergence problem. Comprehensive experimental results on four well-known benchmarks show that our approach significantly improves model accuracy compared with state-of-the-art FL methods without incurring extra communication overhead.
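To make the contrast with per-device aggregation concrete, the following is a minimal sketch of the concatenation idea only, not the authors' implementation. It assumes a toy linear-regression model, fixed hand-written device chains in place of the paper's device selection strategy (which is not reproduced here), and plain averaging as the aggregation step. All names (`local_sgd`, `traverse_chain`, `aggregate`, `concatenated_round`, `make_devices`) are hypothetical.

```python
import random


def local_sgd(w, data, lr=0.1, steps=5):
    """Run a few full-batch gradient steps of least-squares regression on one device."""
    w = list(w)  # copy so the caller's model is not mutated
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def traverse_chain(w, chain, devices):
    """Pass one model copy through a chain of devices, training on each in turn."""
    for dev_id in chain:
        w = local_sgd(w, devices[dev_id])
    return w


def aggregate(models):
    """Coordinate-wise average of the model copies (assumed aggregation rule)."""
    return [sum(col) / len(models) for col in zip(*models)]


def concatenated_round(w_global, chains, devices):
    """One cycle: each copy traverses its chain, then all copies are averaged."""
    return aggregate([traverse_chain(w_global, chain, devices) for chain in chains])


def make_devices(num_devices, w_true, samples=50, seed=0):
    """Toy heterogeneous data: device i never observes feature i % len(w_true)."""
    rng = random.Random(seed)
    devices = []
    for i in range(num_devices):
        hidden = i % len(w_true)
        data = []
        for _ in range(samples):
            x = [0.0 if j == hidden else rng.gauss(0.0, 1.0)
                 for j in range(len(w_true))]
            y = sum(wi * xi for wi, xi in zip(w_true, x))
            data.append((x, y))
        devices.append(data)
    return devices


if __name__ == "__main__":
    w_true = [1.0, -2.0, 0.5]
    devices = make_devices(6, w_true)
    chains = [[0, 1, 2], [3, 4, 5]]  # assumed grouping, fixed for illustration
    w = [0.0, 0.0, 0.0]
    for _ in range(200):
        w = concatenated_round(w, chains, devices)
    print(w)
```

In this toy setup no single device observes every feature, so a model trained on one device alone cannot recover all the weights, whereas a copy that traverses a whole chain sees every feature before aggregation. The number of models uploaded per aggregation equals the number of chains rather than the number of devices.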