Abstract:Massive multiple-input multiple-output (MIMO) enjoys great advantage in 5G wireless communication systems owing to its spectrum and energy efficiency. However, hundreds of antennas require large volumes of pilot overhead to guarantee reliable channel estimation in FDD massive MIMO system. Compressive sensing (CS) has been applied for channel estimation by exploiting the inherent sparse structure of massive MIMO channel but suffer from high complexity. To overcome this challenge, this paper develops a hybrid channel estimation scheme by integrating the model-driven CS and data-driven deep unrolling technique. The proposed scheme consists of a coarse estimation part and a fine correction part to respectively exploit the inter- and intraframe sparsities of channels to greatly reduce the pilot overhead. Theoretical result is provided to indicate the convergence of the fine correction and coarse estimation net. Simulation results are provided to verify that our scheme can estimate MIMO channels with low pilot overhead while guaranteeing estimation accuracy with relatively low complexity.
Abstract:Mobile apps, such as mHealth and wellness applications, can benefit from deep learning (DL) models trained with mobile sensing data collected by smart phones or wearable devices. However, currently there is no mobile sensing DL system that simultaneously achieves good model accuracy while adapting to user mobility behavior, scales well as the number of users increases, and protects user data privacy. We propose Zone-based Federated Learning (ZoneFL) to address these requirements. ZoneFL divides the physical space into geographical zones mapped to a mobile-edge-cloud system architecture for good model accuracy and scalability. Each zone has a federated training model, called a zone model, which adapts well to data and behaviors of users in that zone. Benefiting from the FL design, the user data privacy is protected during the ZoneFL training. We propose two novel zone-based federated training algorithms to optimize zone models to user mobility behavior: Zone Merge and Split (ZMS) and Zone Gradient Diffusion (ZGD). ZMS optimizes zone models by adapting the zone geographical partitions through merging of neighboring zones or splitting of large zones into smaller ones. Different from ZMS, ZGD maintains fixed zones and optimizes a zone model by incorporating the gradients derived from neighboring zones' data. ZGD uses a self-attention mechanism to dynamically control the impact of one zone on its neighbors. Extensive analysis and experimental results demonstrate that ZoneFL significantly outperforms traditional FL in two models for heart rate prediction and human activity recognition. In addition, we developed a ZoneFL system using Android phones and AWS cloud. The system was used in a heart rate prediction field study with 63 users for 4 months, and we demonstrated the feasibility of ZoneFL in real-life.
Abstract:Convolutional neural networks (CNNs) have gained remarkable success in recent years. However, their performance highly relies on the architecture hyperparameters, and finding proper hyperparameters for a deep CNN is a challenging optimization problem owing to its high-dimensional and computationally expensive characteristics. Given these difficulties, this study proposes a surrogate-assisted highly cooperative hyperparameter optimization (SHCHO) algorithm for chain-styled CNNs. To narrow the large search space, SHCHO first decomposes the whole CNN into several overlapping sub-CNNs in accordance with the overlapping hyperparameter interaction structure and then cooperatively optimizes these hyperparameter subsets. Two cooperation mechanisms are designed during this process. One coordinates all the sub-CNNs to reproduce the information flow in the whole CNN and achieve macro cooperation among them, and the other tackles the overlapping components by simultaneously considering the involved two sub-CNNs and facilitates micro cooperation between them. As a result, a proper hyperparameter configuration can be effectively located for the whole CNN. Besides, SHCHO also employs the well-performing surrogate technique to assist in the hyperparameter optimization of each sub-CNN, thereby greatly reducing the expensive computational cost. Extensive experimental results on two widely-used image classification datasets indicate that SHCHO can significantly improve the performance of CNNs.
Abstract:Analog in-memory computing (AIMC) -- a promising approach for energy-efficient acceleration of deep learning workloads -- computes matrix-vector multiplications (MVMs) but only approximately, due to nonidealities that often are non-deterministic or nonlinear. This can adversely impact the achievable deep neural network (DNN) inference accuracy as compared to a conventional floating point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar-model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, not the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
Abstract:In frequency division duplex (FDD) massive MIMO systems, reliable downlink channel estimation is essential for the subsequent data transmission but is realized at the cost of massive pilot overhead due to hundreds of antennas at base station (BS). In order to reduce the pilot overhead without compromising the estimation, compressive sensing (CS) based methods have been widely applied for channel estimation by exploiting the inherent sparse structure of massive MIMO channel in angular domain. However, they still suffer from high complexity during optimization process and the requirement of prior knowledge on sparsity information. To overcome these challenges, this paper develops a novel hybrid channel estimation framework by integrating the model-driven CS and data-driven deep unrolling techniques. The proposed framework is composed of a coarse estimation part and a fine correction part, which is implemented in a two-stage manner to exploit both inter- and intra-frame sparsities of channels in angular domain. Then, two estimation schemes are designed depending on whether priori sparsity information is required, where the second scheme designs a new thresholding function to eliminate such requirement. Numerical results are provided to verify that our schemes can achieve high accuracy with low pilot overhead and low complexity.
Abstract:This paper presents the design, implementation, and evaluation of FLSys, a mobile-cloud federated learning (FL) system that supports deep learning models for mobile apps. FLSys is a key component toward creating an open ecosystem of FL models and apps that use these models. FLSys is designed to work with mobile sensing data collected on smart phones, balance model performance with resource consumption on the phones, tolerate phone communication failures, and achieve scalability in the cloud. In FLSys, different DL models with different FL aggregation methods in the cloud can be trained and accessed concurrently by different apps. Furthermore, FLSys provides a common API for third-party app developers to train FL models. FLSys is implemented in Android and AWS cloud. We co-designed FLSys with a human activity recognition (HAR) in the wild FL model. HAR sensing data was collected in two areas from the phones of 100+ college students during a five-month period. We implemented HAR-Wild, a CNN model tailored to mobile devices, with a data augmentation mechanism to mitigate the problem of non-Independent and Identically Distributed (non-IID) data that affects FL model training in the wild. A sentiment analysis (SA) model is used to demonstrate how FLSys effectively supports concurrent models, and it uses a dataset with 46,000+ tweets from 436 users. We conducted extensive experiments on Android phones and emulators showing that FLSys achieves good model utility and practical system performance.
Abstract:By remarkably reducing real fitness evaluations, surrogate-assisted evolutionary algorithms (SAEAs), especially hierarchical SAEAs, have been shown to be effective in solving computationally expensive optimization problems. The success of hierarchical SAEAs mainly profits from the potential benefit of their global surrogate models known as "blessing of uncertainty" and the high accuracy of local models. However, their performance leaves room for improvement on highdimensional problems since now it is still challenging to build accurate enough local models due to the huge solution space. Directing against this issue, this study proposes a new hierarchical SAEA by training local surrogate models with the help of the random projection technique. Instead of executing training in the original high-dimensional solution space, the new algorithm first randomly projects training samples onto a set of low-dimensional subspaces, then trains a surrogate model in each subspace, and finally achieves evaluations of candidate solutions by averaging the resulting models. Experimental results on six benchmark functions of 100 and 200 dimensions demonstrate that random projection can significantly improve the accuracy of local surrogate models and the new proposed hierarchical SAEA possesses an obvious edge over state-of-the-art SAEAs
Abstract:Problem decomposition plays a vital role when applying cooperative coevolution (CC) to large scale global optimization problems. However, most learning-based decomposition algorithms either only apply to additively separable problems or face the issue of false separability detections. Directing against these limitations, this study proposes a novel decomposition algorithm called surrogate-assisted variable grouping (SVG). SVG first designs a general-separability-oriented detection criterion according to whether the optimum of a variable changes with other variables. This criterion is consistent with the separability definition and thus endows SVG with broad applicability and high accuracy. To reduce the fitness evaluation requirement, SVG seeks the optimum of a variable with the help of a surrogate model rather than the original expensive high-dimensional model. Moreover, it converts the variable grouping process into a dynamic-binary-tree search one, which facilitates reutilizing historical separability detection information and thus reducing detection times. To evaluate the performance of SVG, a suite of benchmark functions with up to 2000 dimensions, including additively and non-additively separable ones, were designed. Experimental results on these functions indicate that, compared with six state-of-the-art decomposition algorithms, SVG possesses broader applicability and competitive efficiency. Furthermore, it can significantly enhance the optimization performance of CC.
Abstract:Divide-and-conquer-based (DC-based) evolutionary algorithms (EAs) have achieved notable success in dealing with large-scale optimization problems (LSOPs). However, the appealing performance of this type of algorithms generally requires a high-precision decomposition of the optimization problem, which is still a challenging task for existing decomposition methods. This study attempts to address the above issue from a different perspective and proposes an eigenspace divide-and-conquer (EDC) approach. Different from existing DC-based algorithms that perform decomposition and optimization in the original decision space, EDC first establishes an eigenspace by conducting singular value decomposition on a set of high-quality solutions selected from recent generations. Then it transforms the optimization problem into the eigenspace, and thus significantly weakens the dependencies among the corresponding eigenvariables. Accordingly, these eigenvariables can be efficiently grouped by a simple random strategy and each of the resulting subproblems can be addressed more easily by a traditional EA. To verify the efficiency of EDC, comprehensive experimental studies were conducted on two sets of benchmark functions. Experimental results indicate that EDC is robust to its parameters and has good scalability to the problem dimension. The comparison with several state-of-the-art algorithms further confirms that EDC is pretty competitive and performs better on complicated LSOPs.
Abstract:Urban traffic systems worldwide are suffering from severe traffic safety problems. Traffic safety is affected by many complex factors, and heavily related to all drivers' behaviors involved in traffic system. Drivers with aggressive driving behaviors increase the risk of traffic accidents. In order to manage the safety level of traffic system, we propose Driving Safety Credit inspired by credit score in financial security field, and design a scoring method using trajectory data and violation records. First, we extract driving habits, aggressive driving behaviors and traffic violation behaviors from driver's trajectories and traffic violation records. Next, we train a classification model to filtered out irrelevant features. And at last, we score each driver with selected features. We verify our proposed scoring method using 40 days of traffic simulation, and proves the effectiveness of our scoring method.