Abstract:Beam management is an important technique for improving signal strength and reducing interference in wireless communication systems. Recently, there has been increasing interest in using diverse sensing modalities for beam management. However, it remains a major challenge to process multi-modal data efficiently and extract useful information. On the other hand, the recently emerged multi-modal transformer (MMT) is a promising technique that can process multi-modal data by capturing long-range dependencies. While the MMT is highly effective in handling multi-modal data and providing robust beam management, integrating reinforcement learning (RL) further enhances its adaptability in dynamic environments. In this work, we propose a two-step beam management method that combines the MMT with RL for dynamic beam index prediction. In the first step, we divide the available beam indices into several groups and leverage the MMT to process diverse data modalities and predict the optimal beam group. In the second step, we employ RL for fast beam decision-making within each group, which in turn maximizes throughput. Our proposed framework is tested on a 6G dataset, where it achieves higher beam prediction accuracy and system throughput than both the MMT-only and the RL-only methods.
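A minimal sketch of the two-step idea described above, assuming a placeholder for the MMT group predictor and a simple epsilon-greedy bandit for the intra-group beam refinement; the group sizes, feature shapes, and reward model are illustrative assumptions, not the paper's actual implementation.

```python
# Step 1: (stubbed) multi-modal group prediction; Step 2: epsilon-greedy
# refinement of the beam index within the predicted group.
import numpy as np

NUM_BEAMS = 64
GROUP_SIZE = 8
NUM_GROUPS = NUM_BEAMS // GROUP_SIZE


def predict_beam_group(multimodal_features: np.ndarray) -> int:
    """Stand-in for the MMT stage: in the paper this would fuse the sensing
    modalities and output the most likely beam group; here we only hash the
    features to a group index for illustration."""
    return int(abs(multimodal_features.sum())) % NUM_GROUPS


class IntraGroupBandit:
    """Epsilon-greedy selection of the best beam inside one group (step two)."""

    def __init__(self, group_size: int, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.counts = np.zeros(group_size)
        self.values = np.zeros(group_size)  # running mean of observed throughput

    def select(self, rng: np.random.Generator) -> int:
        if rng.random() < self.epsilon:
            return int(rng.integers(len(self.values)))
        return int(np.argmax(self.values))

    def update(self, beam: int, throughput: float) -> None:
        self.counts[beam] += 1
        self.values[beam] += (throughput - self.values[beam]) / self.counts[beam]


rng = np.random.default_rng(0)
features = rng.normal(size=16)              # placeholder multi-modal features
group = predict_beam_group(features)        # step 1: coarse group prediction
bandit = IntraGroupBandit(GROUP_SIZE)
true_gain = rng.random(GROUP_SIZE)          # hypothetical per-beam throughput
for _ in range(200):                        # step 2: fast intra-group refinement
    local_beam = bandit.select(rng)
    reward = true_gain[local_beam] + 0.05 * rng.normal()
    bandit.update(local_beam, reward)
best_beam = group * GROUP_SIZE + int(np.argmax(bandit.values))
print(f"predicted group {group}, selected global beam index {best_beam}")
```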
Abstract:In recent years, machine learning (ML) techniques have created numerous opportunities for intelligent mobile networks and have accelerated the automation of network operations. However, complex network tasks may involve variables and considerations even beyond the capacity of traditional ML algorithms. On the other hand, large language models (LLMs) have recently emerged, demonstrating near-human-level performance in cognitive tasks across various fields. However, they remain prone to hallucinations and often lack common sense in basic tasks. Therefore, they are regarded as assistive tools for humans. In this work, we propose the concept of "generative AI-in-the-loop" and utilize the semantic understanding, context awareness, and reasoning abilities of LLMs to assist humans in handling complex or unforeseen situations in mobile communication networks. We believe that combining LLMs and ML models allows both to leverage their respective capabilities and achieve better results than either model alone. To support this idea, we begin by analyzing the capabilities of LLMs and compare them with traditional ML algorithms. We then explore potential LLM-based applications in line with the requirements of next-generation networks. We further examine the integration of ML and LLMs, discussing how they can be used together in mobile networks. Unlike existing studies, our research emphasizes the fusion of LLMs with traditional ML-driven next-generation networks and serves as a comprehensive refinement of existing surveys. Finally, we provide a case study to enhance ML-based network intrusion detection with synthesized data generated by LLMs. Our case study further demonstrates the advantages of our proposed idea.
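An illustrative sketch of the case study's data-augmentation loop, assuming the LLM call is stubbed out: records synthesized by a generative model are appended to the training set of a conventional ML intrusion detector. The feature schema, the stub, and the class balance are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def synthesize_attack_records(n: int, rng: np.random.Generator) -> np.ndarray:
    """Placeholder for prompting an LLM to emit rare-attack flow records.

    A real implementation would describe the feature schema in a prompt and
    parse the model's text output; here we draw plausible numeric features.
    """
    return rng.normal(loc=2.0, scale=0.5, size=(n, 4))


rng = np.random.default_rng(0)
X_benign = rng.normal(loc=0.0, size=(500, 4))
X_attack = rng.normal(loc=2.0, size=(20, 4))      # scarce real attack samples
X_synth = synthesize_attack_records(200, rng)     # LLM-generated augmentation

X = np.vstack([X_benign, X_attack, X_synth])
y = np.array([0] * 500 + [1] * (20 + 200))

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```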
Abstract:Large language models (LLMs), especially generative pre-trained transformers (GPTs), have recently demonstrated outstanding ability in information comprehension and problem-solving. This has motivated many studies on applying LLMs to wireless communication networks. In this paper, we propose a pre-trained LLM-empowered framework that performs fully automatic network intrusion detection. Three in-context learning methods are designed and compared to enhance the performance of LLMs. Experiments on a real network intrusion detection dataset show that in-context learning is highly beneficial in improving task performance without any further training or fine-tuning of the LLMs. We show that for GPT-4, testing accuracy and F1-Score can be improved by 90%. Moreover, pre-trained LLMs demonstrate significant potential in performing wireless communication-related tasks. Specifically, the proposed framework reaches an accuracy and F1-Score of over 95% on different types of attacks with GPT-4 using only 10 in-context learning examples.
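A minimal sketch of the in-context learning setup, assuming a handful of labeled flow records are placed in the prompt before the query record and the LLM is asked to output the class. The feature wording, prompt template, and `call_llm` stub are illustrative assumptions, not the exact prompts used in the paper.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (flow description, label)


def build_icl_prompt(examples: List[Example], query: str) -> str:
    lines = ["You are a network intrusion detector.",
             "Classify each flow as 'benign' or 'attack'.", ""]
    for flow, label in examples:             # in-context demonstrations
        lines.append(f"Flow: {flow}\nLabel: {label}\n")
    lines.append(f"Flow: {query}\nLabel:")   # query record to be classified
    return "\n".join(lines)


def call_llm(prompt: str) -> str:
    """Placeholder for a GPT-4 API call; returns a canned answer here."""
    return "attack"


examples = [("dur=0.1s, bytes=120, dst_port=80, flags=SYN", "benign"),
            ("dur=0.0s, bytes=0, dst_port=445, flags=SYN flood", "attack")]
query = "dur=0.0s, bytes=0, dst_port=3389, flags=SYN flood"
print(call_llm(build_icl_prompt(examples, query)))
```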
Abstract:The Internet of Senses (IoS) holds the promise of flawless telepresence-style communication for all human 'receptors' and therefore blurs the distinction between virtual and real environments. We commence by highlighting the compelling use cases empowered by the IoS and the key network requirements they impose. We then elaborate on how the emerging semantic communications and Artificial Intelligence (AI)/Machine Learning (ML) paradigms, along with 6G technologies, may satisfy the requirements of IoS use cases. On the one hand, semantic communications can be applied to extract meaningful and significant information, and hence exploit resources efficiently, and to harness a priori information at the receiver to satisfy IoS requirements. On the other hand, AI/ML facilitates frugal network resource management by making use of the enormous amount of data generated in IoS edge nodes and devices, as well as by optimizing the IoS performance via intelligent agents. However, the intelligent agents deployed at the edge are not completely aware of each other's decisions and environments, hence they operate in a partially rather than fully observable environment. Therefore, we present a case study that adopts a Partially Observable Markov Decision Process (POMDP) formulation and uses Reinforcement Learning to astutely activate and deactivate the component carriers in carrier aggregation, thereby improving User Equipment (UE) throughput and energy consumption, both of which are imperative for IoS use cases. Finally, we outline the challenges and open issues of IoS implementations and of employing semantic communications, edge intelligence, as well as learning under partial observability in the IoS context.
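A toy sketch of the carrier-aggregation case study, assuming a tabular Q-learning agent that observes only a coarse, partial view of the UE state (a quantized buffer level) and decides whether to activate or deactivate a secondary component carrier. The state space, reward weights, and dynamics are illustrative assumptions rather than the paper's environment.

```python
import numpy as np

N_OBS = 4            # quantized buffer occupancy levels (partial observation)
ACTIONS = (0, 1)     # 0 = deactivate secondary carrier, 1 = activate it
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
ENERGY_COST = 0.3    # penalty per step the secondary carrier is active

rng = np.random.default_rng(0)
Q = np.zeros((N_OBS, len(ACTIONS)))


def step(obs: int, action: int):
    """Hypothetical environment: higher buffer + active carrier -> more throughput."""
    throughput = obs * (1.0 + action)           # served traffic this step
    reward = throughput - ENERGY_COST * action  # throughput vs. energy trade-off
    next_obs = int(rng.integers(N_OBS))         # new (noisy) traffic arrival
    return next_obs, reward


obs = 0
for _ in range(5000):
    a = int(rng.integers(len(ACTIONS))) if rng.random() < EPS else int(np.argmax(Q[obs]))
    next_obs, r = step(obs, a)
    Q[obs, a] += ALPHA * (r + GAMMA * Q[next_obs].max() - Q[obs, a])
    obs = next_obs

print("learned activation decision per buffer level:", np.argmax(Q, axis=1))
```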
Abstract:Class-of-service (CoS) network traffic classification (NTC) classifies groups of similar traffic applications. CoS classification is advantageous for resource scheduling by Internet service providers and avoids the necessity of remodelling. Our goal is to find a robust, lightweight, and fast-converging CoS classifier that uses less data in modelling and does not require specialized tools for feature extraction. The commonality of statistical features among network flow segments motivates us to propose a novel segmented learning approach that includes an essential vector representation (EVR) and a simple segment-based classification method. We represent the segmented traffic in vector form using the EVR, and the segmented traffic is then modelled for classification using a random forest. Our solution's success relies on finding the optimal segment size and the minimum number of segments required in modelling. The solution is validated on multiple datasets for various CoS services, including virtual reality (VR). The significant findings of this work are: i) synchronous services, which require acknowledgment and a request to continue communication, are classified with 99% accuracy; ii) the initial 1,000 packets of any session are sufficient to model CoS traffic with promising results, so a CoS classifier can be deployed quickly; and iii) test results remain consistent even when the classifier is trained on one dataset and tested on a different dataset. In summary, our solution is the first to propose segmented learning for NTC, using fewer features to classify most CoS traffic with an accuracy of 99%. The implementation of our solution is available on GitHub.
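A sketch of the segmented-learning idea, assuming a flow's packet-size sequence is split into fixed-size segments, each segment is summarized by simple statistics (a stand-in for the essential vector representation), a random forest is trained on segments, and a flow label is obtained by majority vote over its segments. The segment size and feature set are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

SEGMENT_SIZE = 50  # packets per segment (the paper tunes this value)


def segment_features(packet_sizes: np.ndarray) -> np.ndarray:
    """Return one statistical feature vector per full segment of the flow."""
    n_segments = len(packet_sizes) // SEGMENT_SIZE
    segs = packet_sizes[: n_segments * SEGMENT_SIZE].reshape(n_segments, SEGMENT_SIZE)
    return np.column_stack([segs.mean(1), segs.std(1), segs.min(1), segs.max(1)])


rng = np.random.default_rng(0)
# Two synthetic CoS classes with different packet-size behaviour.
flows = [rng.normal(200, 20, 1000) for _ in range(30)] + \
        [rng.normal(1200, 150, 1000) for _ in range(30)]
labels = [0] * 30 + [1] * 30

X = np.vstack([segment_features(f) for f in flows])
y = np.repeat(labels, [len(segment_features(f)) for f in flows])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Classify a new flow by majority vote over its per-segment predictions.
test_flow = rng.normal(1200, 150, 1000)
votes = clf.predict(segment_features(test_flow))
print("predicted class:", np.bincount(votes.astype(int)).argmax())
```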
Abstract:Cloud virtual reality (VR) gaming traffic characteristics such as frame size, inter-arrival time, and latency need to be carefully studied as a first step toward scalable cloud VR service provisioning. To this end, in this paper we analyze the behavior of VR gaming traffic and its Quality of Service (QoS) when VR rendering is conducted remotely in the cloud. We first build a VR testbed utilizing a cloud server, a commercial VR headset, and an off-the-shelf WiFi router. Using this testbed, we collect and process cloud VR gaming traffic data from different games under a number of network conditions and under both fixed and adaptive video encoding schemes. To analyze application-level characteristics such as video frame size, frame inter-arrival time, frame loss, and frame latency, we develop an interval-threshold-based identification method for video frames. Based on the frame identification results, we present two statistical models that capture the behavior of the VR gaming video traffic. These models can be used by researchers and practitioners to generate VR traffic for simulations and experiments, and are paramount in designing advanced radio resource management (RRM) and network optimization for cloud VR gaming services. To the best of the authors' knowledge, this is the first measurement study and analysis conducted on a commercial cloud VR gaming platform under both fixed and adaptive bitrate streaming. We make our VR traffic datasets publicly available for further research by the community.
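A sketch of the interval-threshold frame identification, assuming consecutive packets whose inter-arrival gap stays below a threshold are grouped into one video frame and per-frame size and inter-arrival statistics are then computed. The threshold value and the synthetic trace are assumptions for illustration.

```python
import numpy as np

GAP_THRESHOLD = 2e-3  # seconds; a gap larger than this starts a new frame


def identify_frames(timestamps: np.ndarray, sizes: np.ndarray):
    """Return (frame_sizes, frame_start_times) from a packet-level trace."""
    frame_sizes, frame_starts = [], []
    current_size, current_start = sizes[0], timestamps[0]
    for t, s, prev_t in zip(timestamps[1:], sizes[1:], timestamps[:-1]):
        if t - prev_t > GAP_THRESHOLD:          # gap ends the current frame
            frame_sizes.append(current_size)
            frame_starts.append(current_start)
            current_size, current_start = 0, t
        current_size += s
    frame_sizes.append(current_size)
    frame_starts.append(current_start)
    return np.array(frame_sizes), np.array(frame_starts)


# Synthetic trace: ~90 frames at 90 fps, each arriving as a short packet burst.
rng = np.random.default_rng(0)
ts, sz = [], []
for i in range(90):
    n_pkts = int(rng.integers(5, 15))
    ts.extend(i / 90 + np.sort(rng.uniform(0, 1e-3, n_pkts)))
    sz.extend(rng.integers(1000, 1500, n_pkts))

frame_sizes, frame_starts = identify_frames(np.array(ts), np.array(sz))
inter_arrival = np.diff(frame_starts)
print(f"{len(frame_sizes)} frames, mean size {frame_sizes.mean():.0f} B, "
      f"mean inter-arrival {inter_arrival.mean() * 1e3:.2f} ms")
```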