Abstract: Despite their recent introduction, Generative Adversarial Networks (GANs) are an extensively researched sub-field of machine learning for creating synthetic data through deep generative modeling. GANs have consequently been applied in a number of domains, most notably computer vision, where they are typically used to generate or transform synthetic images. Given their relative ease of use, it is natural that researchers in networking, a field that has seen extensive application of deep learning methods, should take an interest in GAN-based approaches. A comprehensive survey of this activity is therefore urgently needed. In this paper, we demonstrate how this branch of machine learning can benefit multiple aspects of computer and communication networks, including mobile networks, network analysis, the internet of things, the physical layer, and cybersecurity. In doing so, we provide a novel evaluation framework for comparing the performance of different models in non-image applications, and apply it to a number of reference network datasets.
Abstract: Human-centered data collection is typically costly and raises privacy issues. Various solutions have been proposed in the literature to reduce this cost, such as crowdsourced data collection or the use of semi-supervised algorithms. However, semi-supervised algorithms require a source of unlabeled data, and crowdsourcing methods require many active participants. An alternative, passive data collection modality is fingerprint-based localization. Such methods use received signal strength (RSS) or channel state information (CSI) in wireless sensor networks to localize users in indoor or outdoor environments. In this paper, we introduce a novel approach that uses synthetic data to reduce training data collection costs in fingerprint-based localization. Generative adversarial networks (GANs) are used to learn the distribution of a limited sample of collected data and then to produce synthetic data that augments the real collected data in order to increase overall positioning accuracy. Experimental results on a benchmark dataset show that, with the proposed method and a combination of 10% collected data and 90% synthetic data, we obtain essentially the same positioning accuracy as with the full set of collected data. This means that, by employing GAN-generated synthetic data, we can use 90% less real data, thereby reducing data collection costs while achieving acceptable accuracy.
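The mixing step described in this abstract (a small real sample topped up with generated fingerprints) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, array shapes, and the random data are hypothetical, and `placeholder_generator` is a simple jitter-based stand-in, not a GAN and not the authors' model. In practice a trained GAN generator would take its place.

```python
import numpy as np

def build_training_set(real_x, real_y, sample_synthetic, real_fraction=0.1, seed=0):
    """Keep a fraction of the real fingerprints and fill the rest with synthetic ones."""
    rng = np.random.default_rng(seed)
    n_total = len(real_x)
    n_real = max(1, int(round(real_fraction * n_total)))
    keep = rng.choice(n_total, size=n_real, replace=False)
    kept_x, kept_y = real_x[keep], real_y[keep]
    # The generator only ever sees the kept (limited) real sample.
    synth_x, synth_y = sample_synthetic(kept_x, kept_y, n_total - n_real, rng)
    return np.concatenate([kept_x, synth_x]), np.concatenate([kept_y, synth_y])

def placeholder_generator(x, y, n, rng):
    """Hypothetical stand-in for a trained GAN generator (NOT a GAN):
    resamples kept fingerprints and adds Gaussian jitter to the RSS values."""
    idx = rng.integers(0, len(x), size=n)
    return x[idx] + rng.normal(scale=1.0, size=(n, x.shape[1])), y[idx]

# Assumed toy data: 200 fingerprints, 8 access points, 2-D positions.
rng = np.random.default_rng(42)
real_x = rng.normal(loc=-70.0, scale=5.0, size=(200, 8))  # RSS in dBm (made up)
real_y = rng.uniform(0.0, 10.0, size=(200, 2))            # positions in metres (made up)

train_x, train_y = build_training_set(real_x, real_y, placeholder_generator)
print(train_x.shape, train_y.shape)
```

With `real_fraction=0.1`, the resulting set has the same size as the original collection, but only 10% of its rows are measured data.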
Abstract: Indoor human activity recognition (HAR) explores the correlation between human body movements and reflected WiFi signals in order to classify different activities. By analyzing WiFi signal patterns, especially the dynamics of channel state information (CSI), different activities can be distinguished. Gathering CSI data is expensive in terms of both time and equipment. In this paper, we use synthetic data to reduce the need for real measured CSI. We present a semi-supervised learning method for CSI-based activity recognition systems in which a long short-term memory (LSTM) network is employed to learn features and recognize seven different actions. We apply principal component analysis (PCA) to the CSI amplitude data, while a short-time Fourier transform (STFT) extracts features in the frequency domain. We first train the LSTM network entirely on raw CSI data, which takes much more processing time. To address this, we generate data using 50% of the raw data in conjunction with a generative adversarial network (GAN). Our experimental results confirm that this model can increase classification accuracy by 3.4% and reduce the log loss by almost 16% in the considered scenario.
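The preprocessing described here (PCA on CSI amplitude, STFT for frequency-domain features) might look roughly like the sketch below. It is a minimal illustration, not the authors' pipeline: the abstract does not say whether the STFT is applied to the PCA output, so chaining them is an assumption, as are the function names, the window length, the hop size, the number of components, and the random placeholder data.

```python
import numpy as np

def pca_reduce(amplitude, n_components):
    """Project (time, subcarriers) CSI amplitudes onto the top principal components."""
    centered = amplitude - amplitude.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def stft_magnitude(signal, window_len, hop):
    """Magnitude spectrogram of a 1-D signal using a Hann window."""
    window = np.hanning(window_len)
    starts = range(0, len(signal) - window_len + 1, hop)
    frames = np.stack([signal[s:s + window_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))  # (frames, window_len // 2 + 1)

def extract_features(amplitude, n_components=3, window_len=64, hop=16):
    """Assumed chaining: PCA first, then an STFT of each retained component."""
    components = pca_reduce(amplitude, n_components)
    spectrograms = [stft_magnitude(components[:, k], window_len, hop)
                    for k in range(n_components)]
    return np.concatenate(spectrograms, axis=1)  # (frames, n_components * bins)

# Placeholder input: 512 packets by 30 subcarriers of made-up amplitudes.
rng = np.random.default_rng(0)
csi_amplitude = rng.normal(size=(512, 30))
features = extract_features(csi_amplitude)
print(features.shape)  # (29, 99)
```

Each row of `features` is one time frame, so the array can be fed to a sequence model such as an LSTM as a sequence of 29 steps with 99 features each.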