Abstract:With the development of computer vision, 3D object detection has become increasingly important in many real-world applications. Limited by the computing power of sensor-side hardware, the detection task is sometimes deployed on remote computing devices or the cloud to execute complex algorithms, which brings massive data transmission overhead. In response, this paper proposes an optical flow-driven semantic communication framework for the stereo-vision 3D object detection task. The proposed framework fully exploits the dependence of stereo-vision 3D detection on semantic information in images and prioritizes the transmission of this semantic information to reduce total transmission data sizes while ensuring the detection accuracy. Specifically, we develop an optical flow-driven module to jointly extract and recover semantics from the left and right images to reduce the loss of the left-right photometric alignment semantic information and improve the accuracy of depth inference. Then, we design a 2D semantic extraction module to identify and extract semantic meaning around the objects to enhance the transmission of semantic information in the key areas. Finally, a fusion network is used to fuse the recovered semantics, and reconstruct the stereo-vision images for 3D detection. Simulation results show that the proposed method improves the detection accuracy by nearly 70% and outperforms the traditional method, especially for the low signal-to-noise ratio regime.
Abstract:Multi-target detection and communication with extremely large-scale antenna arrays (ELAAs) operating at high frequencies necessitate generating multiple beams. However, conventional algorithms are slow and computationally intensive. For instance, they can simulate a \num{200}-antenna system over two weeks, and the time complexity grows exponentially with the number of antennas. Thus, this letter explores an ultra-low-complex solution for a multi-user, multi-target integrated sensing and communication (ISAC) system equipped with an ELAA base station (BS). It maximizes the communication sum rate while meeting sensing beampattern gain targets and transmit power constraints. As this problem is non-convex, a Riemannian stochastic gradient descent-based augmented Lagrangian manifold optimization (SGALM) algorithm is developed, which searches on a manifold to ensure constraint compliance. The algorithm achieves ultra-low complexity and superior runtime performance compared to conventional algorithms. For example, it is \num{56} times faster than the standard benchmark for \num{257} BS antennas.
Abstract:In this paper, we investigate receiver design for high frequency (HF) skywave massive multiple-input multiple-output (MIMO) communications. We first establish a modified beam based channel model (BBCM) by performing uniform sampling for directional cosine with deterministic sampling interval, where the beam matrix is constructed using a phase-shifted discrete Fourier transform (DFT) matrix. Based on the modified BBCM, we propose a beam structured turbo receiver (BSTR) involving low-dimensional beam domain signal detection for grouped user terminals (UTs), which is proved to be asymptotically optimal in terms of minimizing mean-squared error (MSE). Moreover, we extend it to windowed BSTR by introducing a windowing approach for interference suppression and complexity reduction, and propose a well-designed energy-focusing window. We also present an efficient implementation of the windowed BSTR by exploiting the structure properties of the beam matrix and the beam domain channel sparsity. Simulation results validate the superior performance of the proposed receivers but with remarkably low complexity.
Abstract:Semantic communications have been explored to perform downstream intelligent tasks by extracting and transmitting essential information. In this paper, we introduce a large model-empowered streaming semantic communication system for speech translation across various languages, named LaSC-ST. Specifically, we devise an edge-device collaborative semantic communication architecture by offloading the intricate semantic extraction module to edge servers, thereby reducing the computational burden on local devices. To support multilingual speech translation, pre-trained large speech models are utilized to learn unified semantic features from speech in different languages, breaking the constraint of a single input language and enhancing the practicality of the LaSC-ST. Moreover, the input speech is sequentially streamed into the developed system as short speech segments, which enables low transmission latency without the degradation in speech translation quality. A novel dynamic speech segmentation algorithm is proposed to further minimize the transmission latency by adaptively adjusting the duration of speech segments. According to simulation results, the LaSC-ST provides more accurate speech translation and achieves streaming transmission with lower latency compared to existing non-streaming semantic communication systems.
Abstract:To enable high data rates and sensing resolutions, integrated sensing and communication (ISAC) networks leverage extremely large antenna arrays and high frequencies, extending the Rayleigh distance and making near-field (NF) spherical wave propagation dominant. This unlocks numerous spatial degrees of freedom, raising the challenge of optimizing them for communication and sensing tradeoffs. To this end, we propose a rate-splitting multiple access (RSMA)-based NF-ISAC transmit scheme utilizing hybrid digital-analog antennas. RSMA enhances interference management, while a variable number of dedicated sensing beams adds beamforming flexibility. The objective is to maximize the minimum communication rate while ensuring multi-target sensing performance by jointly optimizing receive filters, analog and digital beamformers, common rate allocation, and the sensing beam count. To address uncertainty in sensing beam allocation, a rank-zero solution reconstruction method demonstrates that dedicated sensing beams are unnecessary for NF multi-target detection. A penalty dual decomposition (PDD)-based double-loop algorithm is introduced, employing weighted minimum mean-squared error (WMMSE) and quadratic transforms to reformulate communication and sensing rates. Simulations reveal that the proposed scheme: 1) Achieves performance comparable to fully digital beamforming with fewer RF chains, (2) Maintains NF multi-target detection without compromising communication rates, and 3) Significantly outperforms space division multiple access (SDMA) and far-field ISAC systems.
Abstract:Wireless communications and sensing (WCS) establish the backbone of modern information exchange and environment perception. Typical applications range from mobile networks and the Internet of Things to radar and sensor grids. The incorporation of machine learning further expands WCS's boundaries, unlocking automated and high-quality data analytics, together with advisable and efficient decision-making. Despite transformative capabilities, wireless systems often face numerous uncertainties in design and operation, such as modeling errors due to incomplete physical knowledge, statistical errors arising from data scarcity, measurement errors caused by sensor imperfections, computational errors owing to resource limitation, and unpredictability of environmental evolution. Once ignored, these uncertainties can lead to severe outcomes, e.g., performance degradation, system untrustworthiness, inefficient resource utilization, and security vulnerabilities. As such, this article reviews mature and emerging architectural, computational, and operational countermeasures, encompassing uncertainty-aware designs of signals and systems (e.g., diversity, adaptivity, modularity), as well as uncertainty-aware modeling and computational frameworks (e.g., risk-informed optimization, robust signal processing, and trustworthy machine learning). Trade-offs to employ these methods, e.g., robustness vs optimality, are also highlighted.
Abstract:Future communication networks are expected to connect massive distributed artificial intelligence (AI). Exploiting aligned priori knowledge of AI pairs, it is promising to convert high-dimensional data transmission into highly-compressed semantic communications (SC). However, to accommodate the local data distribution and user preferences, AIs generally adapt to different domains, which fundamentally distorts the SC alignment. In this paper, we propose a zero-forget domain adaptation (ZFDA) framework to preserve SC alignment. To prevent the DA from changing substantial neural parameters of AI, we design sparse additive modifications (SAM) to the parameters, which can be efficiently stored and switched-off to restore the SC alignment. To optimize the SAM, we decouple it into tractable continuous variables and a binary mask, and then handle the binary mask by a score-based optimization. Experimental evaluations on a SC system for image transmissions validate that the proposed framework perfectly preserves the SC alignment with almost no loss of DA performance, even improved in some cases, at a cost of less than 1% of additional memory.
Abstract:Artificial intelligence (AI) provides an alternative way to design channel coding with affordable complexity. However, most existing studies can only learn codes for a given size and rate, typically defined by a fixed network architecture and a set of parameters. The support of multiple code rates is essential for conserving bandwidth under varying channel conditions while it is costly to store multiple AI models or parameter sets. In this article, we propose an auto-encoder (AE) based rate-compatible linear block codes (RC-LBCs). The coding process associated with AI or non-AI decoders and multiple puncturing patterns is optimized in a data-driven manner. The superior performance of the proposed AI-based RC-LBC is demonstrated through our numerical experiments.
Abstract:For high-speed train (HST) millimeter wave (mmWave) communications, the use of narrow beams with small beam coverage needs frequent beam switching, while wider beams with small beam gain leads to weaker mmWave signal strength. In this paper, we consider beam switching based beam design, which is formulated as an optimization problem aiming to minimize the number of switched beams within a predetermined railway range subject to that the receiving signal-to-noise ratio (RSNR) at the HST is no lower than a predetermined threshold. To solve this problem, we propose two sequential beam design schemes, both including two alternately-performed stages. In the first stage, given an updated beam coverage according to the railway range, we transform the problem into a feasibility problem and further convert it into a min-max optimization problem by relaxing the RSNR constraints into a penalty of the objective function. In the second stage, we evaluate the feasibility of the beamformer obtained from solving the min-max problem and determine the beam coverage accordingly. Simulation results show that compared to the first scheme, the second scheme can achieve 96.20\% reduction in computational complexity at the cost of only 0.0657\% performance degradation.
Abstract:As a fundamental technique in array signal processing, beamforming plays a crucial role in amplifying signals of interest while mitigating interference and noise. When uncertainties exist in the signal model or the data size of snapshots is limited, the performance of beamformers significantly degrades. In this article, we comprehensively study the conceptual system, theoretical analysis, and algorithmic design for robust beamforming. Particularly, four technical approaches for robust beamforming are discussed, including locally robust beamforming, globally robust beamforming, regularized beamforming, and Bayesian-nonparametric beamforming. In addition, we investigate the equivalence among the methods and suggest a unified robust beamforming framework. As an application example, we show that the resolution of robust beamformers for direction-of-arrival (DoA) estimation can be greatly refined by incorporating the characteristics of subspace methods.