Abstract:Terahertz (THz) communications, with their substantial bandwidth, are essential for meeting the ultra-high data rate demands of emerging high-mobility scenarios such as vehicle-to-everything (V2X) networks. In these contexts, beamwidth adaptation has been explored to address the problem that high-mobility targets frequently move out of the narrow THz beam range. However, existing approaches cannot effectively track targets due to a lack of real-time motion awareness. Consequently, we propose a sensing-assisted beam tracking scheme with real-time beamwidth adaptation. Specifically, the base station (BS) periodically collects prior sensing information to predict the target's motion path by applying a particular motion model. Then, we build a pre-calculated codebook by optimising precoders to align the beamwidth with various predicted target paths, thereby maximising the average achievable data rate within each sensing period. Finally, the BS selects the optimal precoder from the codebook to maintain stable and continuous connectivity. Simulation results show that the proposed scheme significantly improves the rate performance and reduces the outage probability compared to existing approaches across a range of target mobility levels.
Abstract:Relying on their powerful communication capabilities and rapidly changing geometric configuration, Low Earth Orbit (LEO) satellites have the potential to offer integrated communication and navigation (ICAN) services. However, the isolated resource utilization in traditional satellite communication and navigation systems has compromised system performance. Against this backdrop, this paper formulates a joint beamforming design and satellite selection optimization problem for the LEO-ICAN network to maximize the sum rate, while simultaneously reconciling the positioning performance. A two-layer algorithm is proposed, where the beamforming design in the inner layer is solved by the difference-of-convex programming method to maximize the sum rate, and the satellite selection in the outer layer is modeled as a coalition formation game to simultaneously reconcile the positioning performance. Simulation results verify the superiority of our proposed algorithms, which increase the sum rate by 16.6% and 29.3% compared with the conventional beamforming and satellite selection schemes, respectively.
Abstract:Integrated sensing and communication (ISAC) has been recognized as a key enabler and feature of future wireless networks. Existing works analyzing the performance of ISAC commonly assume discrete-time systems, which overlooks the impacts of temporal, spectral, and spatial properties. To address this issue, we establish a unified information model for band-limited continuous-time ISAC systems. In the established information model, we employ a novel sensing performance metric, called the sensing mutual information (SMI). Through analysis, we show how the SMI can be utilized as a bridge between the mutual information domain and the mean squared error (MSE) domain. In addition, we illustrate the communication mutual information (CMI)-SMI and CMI-MSE regions to identify the performance bounds of ISAC systems in practical settings and reveal the trade-off between communication and sensing performance. Moreover, via analysis and numerical results, we provide two valuable insights into the design of novel ISAC-enabled systems: i) communication prefers waveforms of random amplitude and sensing prefers waveforms of constant amplitude, while both communication and sensing favor waveforms of low correlation with random phases; ii) there exists a positive linear relationship between the allocated time-frequency resource and the achieved communication rate/sensing MSE.
Abstract:An intelligent reflecting surface (IRS) has the potential to enhance sensing performance, due to its capability of reshaping the echo signals. Different from the existing literature, which has commonly focused on IRS beamforming optimization, in this paper, we pay special attention to designing effective signal processing approaches to extract sensing information from IRS-reshaped echo signals. To this end, we investigate an IRS-assisted non-line-of-sight (NLOS) target detection and multi-parameter estimation problem in orthogonal frequency division multiplexing (OFDM) systems. To address this problem, we first propose a novel detection and direction estimation framework, including a low-overhead hierarchical codebook that allows the IRS to generate three-dimensional beams with adjustable beam direction and width, a delay spectrum peak-based beam training scheme for detection and direction estimation, and a beam refinement scheme for further enhancing the accuracy of the direction estimation. Then, we propose a target range and velocity estimation scheme by extracting the delay-Doppler information from the IRS-reshaped echo signals. Numerical results demonstrate that the proposed schemes can achieve a 99.7% target detection rate, a 10^{-3}-rad level direction estimation accuracy, and a 10^{-6}-m/10^{-5}-m/s level range/velocity estimation accuracy.
Abstract:Terahertz (THz) communication is considered one of the most critical technologies for 6G because of its abundant bandwidth. To compensate for the high propagation loss of THz signals, analog/digital hybrid precoding for THz massive multiple-input multiple-output (MIMO) has been proposed to focus signals and extend the communication range. Notably, considering hardware cost and power consumption, infinite- and high-resolution phase shifters (PSs) are difficult to implement in THz massive MIMO, and low-resolution PSs are typically adopted in practice. However, low-resolution PSs cause severe performance degradation. Moreover, beam squint in wideband THz massive MIMO worsens this degradation because the analog PSs are frequency-independent. Motivated by the above factors, in this paper, we first propose a heuristic algorithm for the fully-connected (FC) structure, which optimizes the digital precoder and the analog precoder alternately. Then we migrate the proposed heuristic algorithm to the partially-connected (PC) architecture. To further improve the performance, we extend our design to dynamic subarrays, in which each RF chain can be connected to any antennas, provided that no antenna is connected to more than one RF chain. The numerical results demonstrate that our proposed wideband hybrid precoding with low-resolution PSs achieves better performance than the compared schemes for both the FC and PC structures.
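To illustrate why low-resolution PSs degrade performance, the following minimal sketch quantizes ideal analog phases to a b-bit grid and measures the resulting loss in coherent array gain. It is an illustration only, not the heuristic algorithm of the paper: the function names, the half-wavelength uniform linear array, and the 25-degree steering angle are all assumptions made for this example, and beam squint is not modeled.

```python
import cmath
import math


def quantize_phase(phase, bits):
    """Snap a phase in radians to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    return (round(phase / step) % levels) * step


def normalised_gain(ideal_phases, applied_phases):
    """Coherent gain |sum_n exp(j*(applied_n - ideal_n))| / N; equals 1 when phases match."""
    total = sum(cmath.exp(1j * (a - i)) for i, a in zip(ideal_phases, applied_phases))
    return abs(total) / len(ideal_phases)


def steering_phases(num_antennas, angle_rad):
    """Ideal phases of a half-wavelength-spaced uniform linear array (assumed geometry)."""
    return [math.pi * n * math.sin(angle_rad) for n in range(num_antennas)]


if __name__ == "__main__":
    ideal = steering_phases(64, math.radians(25))
    for bits in (1, 2, 3, 4):
        applied = [quantize_phase(p, bits) for p in ideal]
        print(bits, "bit PS, gain =", round(normalised_gain(ideal, applied), 3))
```

Because the per-element phase error is at most pi / 2**bits, the normalised gain in this sketch is never below cos(pi / 2**bits), which is why the loss shrinks quickly as the resolution grows.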
Abstract:An increasing number of models have achieved great performance in remote sensing tasks with the recent development of Large Language Models (LLMs) and Visual Language Models (VLMs). However, these models are constrained to basic vision and language instruction-tuning tasks, facing challenges in complex remote sensing applications. Additionally, these models lack specialized expertise in professional domains. To address these limitations, we propose an LLM-driven remote sensing intelligent agent named RS-Agent. Firstly, RS-Agent is powered by an LLM that acts as its "Central Controller," enabling it to understand and respond to various problems intelligently. Secondly, our RS-Agent integrates many high-performance remote sensing image processing tools, facilitating multi-tool and multi-turn conversations. Thirdly, our RS-Agent can answer professional questions by leveraging robust knowledge documents. We conducted experiments using several datasets, e.g., RSSDIVCS, RSVQA, and DOTAv1. The experimental results demonstrate that our RS-Agent delivers outstanding performance in many tasks, namely scene classification, visual question answering, and object counting.
Abstract:The application of unmanned aerial vehicles (UAVs) has expanded widely in recent years. It is crucial to ensure accurate latitude and longitude coordinates for UAVs, especially when global navigation satellite systems (GNSS) are disrupted and unreliable. Existing visual localization methods achieve autonomous visual localization without error accumulation by matching the UAV's ground-down view image with ortho satellite maps. However, collecting UAV ground-down view images across diverse locations is costly, leading to a scarcity of large-scale datasets for real-world scenarios. Existing datasets for UAV visual localization are often limited to small geographic areas or are focused only on urban regions with distinct textures. To address this, we define the UAV visual localization task as determining the UAV's real position coordinates on a large-scale satellite map based on the captured ground-down view. In this paper, we present a large-scale dataset, UAV-VisLoc, to facilitate the UAV visual localization task. This dataset comprises images from diverse drones across 11 locations in China, capturing a range of topographical features. The dataset features images from fixed-wing drones and multi-terrain drones, captured at different altitudes and orientations. Our dataset includes 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date. Our dataset is tailored to support both the training and testing of models by providing diverse and extensive data.
Abstract:Cooperative positioning with multiple low earth orbit (LEO) satellites is promising in providing location-based services and enhancing satellite-terrestrial communication. However, positioning accuracy is greatly affected by inter-beam interference and satellite-terrestrial topology geometry. To select the best combination of satellites from visible ones and suppress inter-beam interference, this paper explores the utilization of flexible beam scheduling and beamforming of multi-beam LEO satellites that can adjust beam directions toward the same earth-fixed cell to send positioning signals simultaneously. By leveraging the Cramér-Rao lower bound (CRLB) to characterize user Time Difference of Arrival (TDOA) positioning accuracy, the concerned problem is formulated, aiming at optimizing user positioning accuracy under beam scheduling and beam transmission power constraints. To deal with the mixed-integer-nonconvex problem, we decompose it into an inner beamforming design problem and an outer beam scheduling problem. For the former, we first prove the monotonic relationship between user positioning accuracy and its perceived signal-to-interference-plus-noise ratio (SINR) to reformulate the problem, and then semidefinite relaxation (SDR) is adopted for beamforming design. For the outer problem, a heuristic low-complexity beam scheduling scheme is proposed, whose core idea is to schedule users with lower channel correlation to mitigate inter-beam interference while seeking a proper satellite-terrestrial topology geometry. Simulation results verify the superior positioning performance of our proposed positioning-oriented beamforming and beam scheduling scheme, and it is shown that average user positioning accuracy is improved by 17.1% and 55.9% when the beam transmission power is 20 dBW, compared to conventional beamforming and beam scheduling schemes, respectively.
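As a reading aid, a generic textbook form of the TDOA CRLB is sketched below. It is not taken from the paper: the symbols (user position p, satellite positions s_i, reference satellite 1, per-link range-error variances sigma_i^2, waveform-dependent constant kappa, SINR gamma_i) are assumptions of this sketch, which further assumes Gaussian errors, no clock bias, and variances that do not depend on p.

```latex
% TDOA measurements in range units, relative to reference satellite 1:
%   \Delta_i = \|p - s_i\| - \|p - s_1\| + n_i - n_1, \quad i = 2, \dots, M
\begin{align}
  [\mathbf{H}]_{i-1,:} &= \frac{(\mathbf{p}-\mathbf{s}_i)^{\mathsf T}}{\|\mathbf{p}-\mathbf{s}_i\|}
                        - \frac{(\mathbf{p}-\mathbf{s}_1)^{\mathsf T}}{\|\mathbf{p}-\mathbf{s}_1\|},
                        \qquad i = 2,\dots,M, \\
  \boldsymbol{\Sigma}  &= \operatorname{diag}\!\left(\sigma_2^2,\dots,\sigma_M^2\right)
                        + \sigma_1^2\,\mathbf{1}\mathbf{1}^{\mathsf T},
                        \qquad \sigma_i^2 = \frac{\kappa}{\gamma_i}, \\
  \mathbb{E}\,\|\hat{\mathbf{p}}-\mathbf{p}\|^2 &\ge
      \operatorname{tr}\!\left[\left(\mathbf{H}^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{H}\right)^{-1}\right].
\end{align}
```

In this form, H captures the satellite-terrestrial geometry and Sigma captures the link quality, so a higher SINR on any link shrinks the corresponding variance and tightens the bound, which is consistent with the monotonic relationship mentioned in the abstract.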
Abstract:Communication-sensing integration represents an up-and-coming area of research, enabling wireless networks to simultaneously perform communication and sensing tasks. However, in urban cellular networks, the blockage of buildings results in a complex signal propagation environment, affecting the performance analysis of integrated sensing and communication (ISAC) networks. To overcome this obstacle, this paper constructs a comprehensive framework considering building blockage and employs a distance-correlated blockage model to analyze interference from line of sight (LoS), non-line of sight (NLoS), and target reflection cascading (TRC) links. Using stochastic geometric theory, expressions for signal-to-interference-plus-noise ratio (SINR) and coverage probability for communication and sensing in the presence of blockage are derived, allowing for a comprehensive comparison under the same parameters. The research findings indicate that blockage can positively impact coverage, especially in enhancing communication performance. The analysis also suggests that there exists an optimal base station (BS) density when blockage is of the same order of magnitude as the BS density, maximizing communication or sensing coverage probability.
Abstract:Deep neural networks have achieved promising progress in remote sensing (RS) image classification, for which the training process requires abundant samples for each class. However, it is time-consuming and unrealistic to annotate labels for each RS category, given the fact that the RS target database is increasing dynamically. Zero-shot learning (ZSL) allows for identifying novel classes that are not seen during training, which provides a promising solution for the aforementioned problem. However, previous ZSL models mainly depend on manually-labeled attributes or word embeddings extracted from language models to transfer knowledge from seen classes to novel classes. Besides, pioneering ZSL models use convolutional neural networks pre-trained on ImageNet, which focus on the main objects appearing in each image, neglecting the background context that also matters in RS scene classification. To address the above problems, we propose to collect visually detectable attributes automatically. We predict attributes for each class by depicting the semantic-visual similarity between attributes and images. In this way, the attribute annotation process is accomplished by machine rather than by human annotators, as in other methods. Moreover, we propose a Deep Semantic-Visual Alignment (DSVA) model that takes advantage of the self-attention mechanism in the transformer to associate local image regions together, integrating the background context information for prediction. The DSVA model further utilizes the attribute attention maps to focus on the informative image regions that are essential for knowledge transfer in ZSL, and maps the visual images into attribute space to perform ZSL classification. With extensive experiments, we show that our model outperforms other state-of-the-art models by a large margin on a challenging large-scale RS scene classification benchmark.
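The final step described above, classifying in attribute space, can be sketched generically as follows. This is a minimal illustration of attribute-based ZSL, not the DSVA model: the attribute names, the class attribute vectors, and the cosine-similarity rule are hypothetical choices made for this example, and the predicted attribute vector stands in for the output of an image-to-attribute network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def zsl_classify(predicted_attributes, class_attributes):
    """Return the class whose attribute vector is most similar to the prediction."""
    return max(
        class_attributes,
        key=lambda name: cosine(predicted_attributes, class_attributes[name]),
    )


# Hypothetical attribute order: [water, runway, trees].
# These classes are assumed unseen during training; only their attributes are known.
UNSEEN_CLASSES = {
    "airport": [0.0, 1.0, 0.1],
    "lake": [1.0, 0.0, 0.3],
    "forest": [0.1, 0.0, 1.0],
}

if __name__ == "__main__":
    # Stand-in for the attribute scores a trained model would predict for one image.
    image_attributes = [0.9, 0.05, 0.2]
    print(zsl_classify(image_attributes, UNSEEN_CLASSES))
```

The point of the design is that no training image of an unseen class is needed: a new class can be added by supplying only its attribute vector.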