Abstract:With the rising demand for wireless services and increased awareness of the need for data protection, existing network traffic analysis and management architectures are facing unprecedented challenges in classifying and synthesizing the increasingly diverse services and applications. This paper proposes FS-GAN, a federated self-supervised learning framework to support automatic traffic analysis and synthesis over a large number of heterogeneous datasets. FS-GAN is composed of multiple distributed Generative Adversarial Networks (GANs), with a set of generators, each being designed to generate synthesized data samples following the distribution of an individual service traffic, and each discriminator being trained to differentiate the synthesized data samples and the real data samples of a local dataset. A federated learning-based framework is adopted to coordinate local model training processes of different GANs across different datasets. FS-GAN can classify data of unknown types of service and create synthetic samples that capture the traffic distribution of the unknown types. We prove that FS-GAN can minimize the Jensen-Shannon Divergence (JSD) between the distribution of real data across all the datasets and that of the synthesized data samples. FS-GAN also maximizes the JSD among the distributions of data samples created by different generators, resulting in each generator producing synthetic data samples that follow the same distribution as one particular service type. Extensive simulation results show that the classification accuracy of FS-GAN achieves over 20% improvement in average compared to the state-of-the-art clustering-based traffic analysis algorithms. FS-GAN also has the capability to synthesize highly complex mixtures of traffic types without requiring any human-labeled data samples.
Abstract:With the fast growing demand on new services and applications as well as the increasing awareness of data protection, traditional centralized traffic classification approaches are facing unprecedented challenges. This paper introduces a novel framework, Federated Generative Adversarial Networks and Automatic Classification (FGAN-AC), which integrates decentralized data synthesizing with traffic classification. FGAN-AC is able to synthesize and classify multiple types of service data traffic from decentralized local datasets without requiring a large volume of manually labeled dataset or causing any data leakage. Two types of data synthesizing approaches have been proposed and compared: computation-efficient FGAN (FGAN-\uppercase\expandafter{\romannumeral1}) and communication-efficient FGAN (FGAN-\uppercase\expandafter{\romannumeral2}). The former only implements a single CNN model for processing each local dataset and the later only requires coordination of intermediate model training parameters. An automatic data classification and model updating framework has been proposed to automatically identify unknown traffic from the synthesized data samples and create new pseudo-labels for model training. Numerical results show that our proposed framework has the ability to synthesize highly mixed service data traffic and can significantly improve the traffic classification performance compared to existing solutions.
Abstract:Spatio-temporal modeling of wireless access latency is of great importance for connected-vehicular systems. The quality of the molded results rely heavily on the number and quality of samples which can vary significantly due to the sensor deployment density as well as traffic volume and density. This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal of wireless access latency of a connected vehicles across a wide geographical area. LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure. In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE), a deep generative model, to create latency samples with similar probability distribution with the original samples. Finally, LaMI establishes the empirical PDF of latency performance and maps the PDFs into the confidence levels of different vehicular service requirements. Extensive performance evaluation has been conducted using the real traces collected in a commercial LTE network in a university campus. Simulation results show that our proposed model can significantly improve the accuracy of latency modeling especially compared to existing popular solutions such as interpolation and nearest neighbor-based methods.