School of Aeronautical Engineering, Air Force Engineering University, Xi'an, China
Abstract:The Internet of Sounds (IoS) combines sound sensing, processing, and transmission techniques, enabling collaboration among diverse sound devices. To achieve perceptual quality of sound synchronization in the IoS, it is necessary to precisely synchronize three critical factors: sound quality, timing, and behavior control. However, conventional bit-oriented communication, which focuses on bit reproduction, may not be able to fulfill these synchronization requirements under dynamic channel conditions. One promising approach to address the synchronization challenges of the IoS is through the use of semantic communication (SC) that can capture and leverage the logical relationships in its source data. Consequently, in this paper, we propose an IoS-centric SC framework with a transceiver design. The designed encoder extracts semantic information from diverse sources and transmits it to IoS listeners. It can also distill important semantic information to reduce transmission latency for timing synchronization. At the receiver's end, the decoder employs context- and knowledge-based reasoning techniques to reconstruct and integrate sounds, which achieves sound quality synchronization across diverse communication environments. Moreover, by periodically sharing knowledge, SC models of IoS devices can be updated to optimize their synchronization behavior. Finally, we explore several open issues on mathematical models, resource allocation, and cross-layer protocols.
Abstract:Semantic communication (SemCom) has been a transformative paradigm, emphasizing the precise exchange of meaningful information over traditional bit-level transmissions. However, existing SemCom research, primarily centered on simplified scenarios like single-pair transmissions with direct wireless links, faces significant challenges when applied to real-world radio access networks (RANs). This article introduces a Semantic-aware Radio Access Network (S-RAN), offering a holistic systematic view of SemCom beyond single-pair transmissions. We begin by outlining the S-RAN architecture, introducing new physical components and logical functions along with key design challenges. We then present transceiver design for end-to-end transmission to overcome conventional SemCom transceiver limitations, including static channel conditions, oversimplified background knowledge models, and hardware constraints. Later, we delve into the discussion on radio resource management for multiple users, covering semantic channel modeling, performance metrics, resource management algorithms, and a case study, to elaborate distinctions from resource management for legacy RANs. Finally, we highlight open research challenges and potential solutions. The objective of this article is to serve as a basis for advancing SemCom research into practical wireless systems.
Abstract:Quickly and accurately predicting the flight trajectory of a blue army fighter in close-range air combat helps a red army fighter gain a dominant situation, which is the winning factor in later air combat. However,due to the high speed and even hypersonic capabilities of advanced fighters, the diversity of tactical maneuvers,and the instantaneous nature of situational transitions,it is difficult to meet the requirements of practical combat applications in terms of prediction accuracy.To improve prediction accuracy,this paper proposes a spatio-temporal graph attention network (ST-GAT) using encoding and decoding structures to predict the flight trajectory. The encoder adopts a parallel structure of Transformer and GAT branches embedded with the multi-head self-attention mechanism in each front end. The Transformer branch network is used to extract the temporal characteristics of historical trajectories and capture the impact of the fighter's historical state on future trajectories, while the GAT branch network is used to extract spatial features in historical trajectories and capture potential spatial correlations between fighters.Then we concatenate the outputs of the two branches into a new feature vector and input it into a decoder composed of a fully connected network to predict the future position coordinates of the blue army fighter.The computer simulation results show that the proposed network significantly improves the prediction accuracy of flight trajectories compared to the enhanced CNN-LSTM network (ECNN-LSTM), with improvements of 47% and 34% in both ADE and FDE indicators,providing strong support for subsequent autonomous combat missions.
Abstract:Task-oriented semantic communications (TSC) enhance radio resource efficiency by transmitting task-relevant semantic information. However, current research often overlooks the inherent semantic distinctions among encoded features. Due to unavoidable channel variations from time and frequency-selective fading, semantically sensitive feature units could be more susceptible to erroneous inference if corrupted by dynamic channels. Therefore, this letter introduces a unified channel-resilient TSC framework via information bottleneck. This framework complements existing TSC approaches by controlling information flow to capture fine-grained feature-level semantic robustness. Experiments on a case study for real-time subchannel allocation validate the framework's effectiveness.
Abstract:Privacy, scalability, and reliability are significant challenges in unmanned aerial vehicle (UAV) networks as distributed systems, especially when employing machine learning (ML) technologies with substantial data exchange. Recently, the application of federated learning (FL) to UAV networks has improved collaboration, privacy, resilience, and adaptability, making it a promising framework for UAV applications. However, implementing FL for UAV networks introduces drawbacks such as communication overhead, synchronization issues, scalability limitations, and resource constraints. To address these challenges, this paper presents the Blockchain-enabled Clustered and Scalable Federated Learning (BCS-FL) framework for UAV networks. This improves the decentralization, coordination, scalability, and efficiency of FL in large-scale UAV networks. The framework partitions UAV networks into separate clusters, coordinated by cluster head UAVs (CHs), to establish a connected graph. Clustering enables efficient coordination of updates to the ML model. Additionally, hybrid inter-cluster and intra-cluster model aggregation schemes generate the global model after each training round, improving collaboration and knowledge sharing among clusters. The numerical findings illustrate the achievement of convergence while also emphasizing the trade-offs between the effectiveness of training and communication efficiency.
Abstract:Generative artificial intelligence (GAI) has emerged as a rapidly burgeoning field demonstrating significant potential in creating diverse contents intelligently and automatically. To support such artificial intelligence-generated content (AIGC) services, future communication systems should fulfill much more stringent requirements (including data rate, throughput, latency, etc.) with limited yet precious spectrum resources. To tackle this challenge, semantic communication (SemCom), dramatically reducing resource consumption via extracting and transmitting semantics, has been deemed as a revolutionary communication scheme. The advanced GAI algorithms facilitate SemCom on sophisticated intelligence for model training, knowledge base construction and channel adaption. Furthermore, GAI algorithms also play an important role in the management of SemCom networks. In this survey, we first overview the basics of GAI and SemCom as well as the synergies of the two technologies. Especially, the GAI-driven SemCom framework is presented, where many GAI models for information creation, SemCom-enabled information transmission and information effectiveness for AIGC are discussed separately. We then delve into the GAI-driven SemCom network management involving with novel management layers, knowledge management, and resource allocation. Finally, we envision several promising use cases, i.e., autonomous driving, smart city, and the Metaverse for a more comprehensive exploration.
Abstract:Quick and automated earthquake-damaged building detection from post-event satellite imagery is crucial, yet it is challenging due to the scarcity of training data required to develop robust algorithms. This letter presents the first dataset dedicated to detecting earthquake-damaged buildings from post-event very high resolution (VHR) Synthetic Aperture Radar (SAR) and optical imagery. Utilizing open satellite imagery and annotations acquired after the 2023 Turkey-Syria earthquakes, we deliver a dataset of coregistered building footprints and satellite image patches of both SAR and optical data, encompassing more than four thousand buildings. The task of damaged building detection is formulated as a binary image classification problem, that can also be treated as an anomaly detection problem due to extreme class imbalance. We provide baseline methods and results to serve as references for comparison. Researchers can utilize this dataset to expedite algorithm development, facilitating the rapid detection of damaged buildings in response to future events. The dataset and codes together with detailed explanations are made publicly available at \url{https://github.com/ya0-sun/PostEQ-SARopt-BuildingDamage}.
Abstract:Generative AI applications are recently catering to a vast user base by creating diverse and high-quality AI-generated content (AIGC). With the proliferation of mobile devices and rapid growth of mobile traffic, providing ubiquitous access to high-quality AIGC services via wireless communication networks is becoming the future direction for AIGC products. However, it is challenging to provide optimal AIGC services in wireless networks with unstable channels, limited bandwidth resources, and unevenly distributed computational resources. To tackle these challenges, we propose a semantic communication (SemCom)-empowered AIGC (SemAIGC) generation and transmission framework, where only semantic information of the content rather than all the binary bits should be extracted and transmitted by using SemCom. Specifically, SemAIGC integrates diffusion-based models within the semantic encoder and decoder for efficient content generation and flexible adjustment of the computing workload of both transmitter and receiver. Meanwhile, we devise a resource-aware workload trade-off (ROOT) scheme into the SemAIGC framework to intelligently decide transmitter/receiver workload, thus adjusting the utilization of computational resource according to service requirements. Simulations verify the superiority of our proposed SemAIGC framework in terms of latency and content quality compared to conventional approaches.
Abstract:Crowdsourced platforms provide huge amounts of street-view images that contain valuable building information. This work addresses the challenges in applying Scene Text Recognition (STR) in crowdsourced street-view images for building attribute mapping. We use Flickr images, particularly examining texts on building facades. A Berlin Flickr dataset is created, and pre-trained STR models are used for text detection and recognition. Manual checking on a subset of STR-recognized images demonstrates high accuracy. We examined the correlation between STR results and building functions, and analysed instances where texts were recognized on residential buildings but not on commercial ones. Further investigation revealed significant challenges associated with this task, including small text regions in street-view images, the absence of ground truth labels, and mismatches in buildings in Flickr images and building footprints in OpenStreetMap (OSM). To develop city-wide mapping beyond urban hotspot locations, we suggest differentiating the scenarios where STR proves effective while developing appropriate algorithms or bringing in additional data for handling other cases. Furthermore, interdisciplinary collaboration should be undertaken to understand the motivation behind building photography and labeling. The STR-on-Flickr results are publicly available at https://github.com/ya0-sun/STR-Berlin.
Abstract:Urban land use structures impact local climate conditions of metropolitan areas. To shed light on the mechanism of local climate wrt. urban land use, we present a novel, data-driven deep learning architecture and pipeline, DeepLCZChange, to correlate airborne LiDAR data statistics with the Landsat 8 satellite's surface temperature product. A proof-of-concept numerical experiment utilizes corresponding remote sensing data for the city of New York to verify the cooling effect of urban forests.