Abstract:Integrated Sensing and Communications (ISAC) enables efficient spectrum utilization and reduces hardware costs for beyond 5G (B5G) and 6G networks, facilitating intelligent applications that require both high-performance communication and precise sensing capabilities. This survey provides a comprehensive review of the evolution of ISAC over the years. We examine the expansion of the spectrum across RF and optical ISAC, highlighting the role of advanced technologies, along with key challenges and synergies. We further discuss the advancements in network architecture from single-cell to multi-cell systems, emphasizing the integration of collaborative sensing and interference mitigation strategies. Moreover, we analyze the progress from single-modal to multi-modal sensing, with a focus on the integration of edge intelligence to enable real-time data processing, reduce latency, and enhance decision-making. Finally, we extensively review standardization efforts by 3GPP, IEEE, and ITU, examining the transition of ISAC-related technologies and their implications for the deployment of 6G networks.
Abstract:Communication-centric Integrated Sensing and Communication (ISAC) has been recognized as a promising methodology to implement wireless sensing functionality over existing network architectures, owing to its cost-effectiveness and backward compatibility with legacy cellular systems. However, the inherent randomness of the communication signal can cause large fluctuations in sensing capability, leading to unfavorable detection and estimation performance. To address this issue, we elaborate on random ISAC signal processing methods in this article, aiming to improve the sensing performance without unduly deteriorating the communication functionality. Specifically, we commence by discussing the fundamentals of sensing with random communication signals, including the performance metrics and optimal ranging waveforms. Building on these concepts, we then present a general framework for random ISAC signal transmission, followed by an in-depth exploration of time-domain pulse shaping, frequency-domain constellation shaping, and spatial-domain precoding methods. We provide a comprehensive overview of each of these topics, including models, results, and design guidelines. Finally, we conclude this article by identifying several promising research directions for random ISAC signal transmission.
Abstract:In recent years, ultra-wideband (UWB) technology has garnered widespread attention in academia and industry due to its low power consumption, wide bandwidth, and high time resolution. This paper introduces the design of an asynchronous impulse-radio UWB (IR-UWB) integrated communication and localization (ICL) downlink network, which employs unified waveforms to enable simultaneous data transmission and localization. A differential sequential detection strategy is proposed for data demodulation. To address errors caused by symbol misalignment, a novel symbol confidence metric model is introduced to ensure reliable pulse detection and time-of-arrival (TOA) estimation. Additionally, an asynchronous start-of-frame delimiter (SFD) detection model is constructed to guide parameter optimization in practical applications. Furthermore, clock drift estimation is improved by leveraging the confidence metric within a modified weighted least squares (MWLS) framework. Simulation results demonstrate that the proposed system simultaneously achieves reliable clock drift estimation, communication, and self-localization. The operational range of the confidence metric required for these outcomes is also quantified, providing valuable insights for parameter design and system implementation. Finally, practical measurements with commercial UWB devices show that agent localization accuracy within 10 cm can be achieved at over 90\% confidence.
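The clock-drift step above boils down to a confidence-weighted linear fit: the remote clock is modeled as an affine function of the local clock, and each timestamp pair is weighted by its confidence. A minimal generic weighted-least-squares sketch (not the paper's MWLS; the function name and interface are hypothetical):

```python
import numpy as np

def weighted_clock_drift(local_t, remote_t, conf):
    """Fit the linear clock model remote_t ~ a + b * local_t by weighted
    least squares, where b - 1 is the relative clock drift and `conf`
    gives per-sample confidence weights.  Returns (offset a, rate b)."""
    local_t = np.asarray(local_t, dtype=float)
    remote_t = np.asarray(remote_t, dtype=float)
    w = np.asarray(conf, dtype=float)
    A = np.column_stack([np.ones_like(local_t), local_t])  # design matrix
    AtW = A.T * w                  # scales each column of A^T by its weight
    # closed-form WLS solution: theta = (A^T W A)^{-1} A^T W y
    a, b = np.linalg.solve(AtW @ A, AtW @ remote_t)
    return a, b
```

Downweighting low-confidence TOA samples (small entries in `conf`) lets unreliable pulse detections contribute less to the drift estimate, which is the intuition behind the confidence-metric weighting described above.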
Abstract:Tool learning has emerged as a promising direction by extending Large Language Models' (LLMs) capabilities with external tools. Existing tool learning studies primarily focus on the general-purpose tool-use capability, which addresses explicit user requirements in instructions. However, they overlook the importance of personalized tool-use capability, leading to an inability to handle implicit user preferences. To address this limitation, we first formulate the task of personalized tool learning, which integrates a user's interaction history to enable personalized tool usage. To fill the gap of missing benchmarks, we construct PEToolBench, featuring diverse user preferences reflected in interaction history under three distinct personalized settings, and encompassing a wide range of tool-use scenarios. Moreover, we propose a framework PEToolLLaMA to adapt LLMs to the personalized tool learning task, which is trained through supervised fine-tuning and direct preference optimization. Extensive experiments on PEToolBench demonstrate the superiority of PEToolLLaMA over existing LLMs.
Abstract:With the advancement of large language models (LLMs), solving complex reasoning tasks has gained increasing attention. Inference-time computation methods (e.g., Best-of-N, beam search, etc.) are particularly valuable as they can enhance reasoning performance without modifying model parameters or requiring additional training. However, these techniques come with implementation challenges, and most existing methods remain at the proof-of-concept stage with limited practical adoption due to their computational complexity and varying effectiveness across different tasks. In this paper, we investigate and benchmark diverse inference-time computation strategies across reasoning tasks of varying complexity. Since most current methods rely on a proposer-verifier pipeline that first generates candidate solutions (e.g., reasoning solutions) and then selects the best one based on reward signals (e.g., RLHF rewards, process rewards), our research focuses on optimizing both candidate solution generation (e.g., instructing prompts, hyperparameters such as temperature and top-p) and reward mechanisms (e.g., self-evaluation, reward types). Through extensive experiments (more than 20,000 A100-80G GPU hours with over 1,000 experiments) across a variety of models (e.g., Llama, Qwen, and Mistral families) of various sizes, our ablation studies reveal that previously overlooked strategies can significantly enhance performance (e.g., tuning temperature can improve reasoning task performance by up to 5%). Furthermore, we establish a standardized benchmark for inference-time computation by systematically evaluating six representative methods across eight reasoning tasks. These findings provide a stronger foundation for future research. The code is available at https://github.com/usail-hkust/benchmark_inference_time_computation_LL
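The proposer-verifier pipeline described above reduces, in its simplest form (Best-of-N), to sampling N candidates and keeping the one the verifier scores highest. A minimal sketch, where `propose` and `score` are hypothetical stand-ins for an LLM sampler and a reward model:

```python
def best_of_n(propose, score, n=8):
    """Best-of-N selection: draw n candidate solutions from the
    proposer, score each with the verifier/reward model, and return
    the highest-scoring candidate."""
    candidates = [propose() for _ in range(n)]
    return max(candidates, key=score)
```

In practice, `propose` would sample with tuned decoding hyperparameters (temperature, top-p) and `score` could be a process reward model or a self-evaluation prompt, which are exactly the knobs the benchmark above varies.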
Abstract:Next-generation wireless networks are conceived to provide reliable and high-data-rate communication services for diverse scenarios, such as vehicle-to-vehicle, unmanned aerial vehicle, and satellite networks. The severe Doppler spreads in the underlying time-varying channels induce destructive inter-carrier interference (ICI) in the extensively adopted orthogonal frequency division multiplexing (OFDM) waveform, leading to severe performance degradation. This calls for a new air interface design that can accommodate the severe delay-Doppler spreads in highly dynamic channels while possessing sufficient flexibility to cater to various applications. This article provides a comprehensive overview of a promising chirp-based waveform named affine frequency division multiplexing (AFDM). It features two tunable parameters and achieves the optimal diversity order in doubly dispersive channels (DDCs). We study the fundamental principles of AFDM, illustrating its intrinsic suitability for DDCs. Based on that, several potential applications of AFDM are explored. Furthermore, the major challenges of AFDM and their corresponding solutions are presented, followed by several future research directions. Finally, we draw some instructive conclusions about AFDM, hoping to provide useful inspiration for its development.
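For concreteness, the two tunable parameters mentioned above are the chirp rates of the underlying discrete affine Fourier transform. In the commonly used formulation from the AFDM literature (a sketch of the standard form, not stated in the abstract itself), the time-domain samples are

```latex
s[n] \;=\; \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} x[m]\,
      e^{\,j 2\pi \left( c_1 n^2 + \frac{mn}{N} + c_2 m^2 \right)},
\qquad n = 0, \dots, N-1,
```

where $x[m]$ are the data symbols and $c_1$, $c_2$ are the tunable chirp parameters; setting $c_1 = c_2 = 0$ recovers ordinary OFDM, while tuning $c_1$ to the channel's maximum Doppler shift is what yields the optimal diversity order in doubly dispersive channels.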
Abstract:This paper proposes an integrated sensing and communications (ISAC) system based on the affine frequency division multiplexing (AFDM) waveform. To this end, a metric set is designed according not only to the maximum tolerable delay/Doppler, but also to the weighted spectral efficiency and the outage/error probability of sensing and communications. This enables an analytical investigation of the performance trade-offs of the AFDM-ISAC system using the derived analytical relations among the metrics and the AFDM waveform parameters. Moreover, by revealing that the delay and the integer/fractional parts of the normalized Doppler can be decoupled in the affine Fourier transform-Doppler domain, an efficient estimation method is proposed for our AFDM-ISAC system, whose unambiguous Doppler range can exceed the limit imposed by the subcarrier spacing. Theoretical analyses and numerical results verify that our proposed AFDM-ISAC system can significantly enlarge the unambiguous delay/Doppler while retaining good spectral efficiency and peak-to-sidelobe level ratio in high-mobility scenarios.
Abstract:6G communications systems are expected to integrate radar-like sensing capabilities enabling novel use cases. However, integrated sensing and communications (ISAC) introduces a trade-off between communications and sensing performance because the optimal constellations for each task differ. In this paper, we compare geometric, probabilistic, and joint constellation shaping for orthogonal frequency division multiplexing (OFDM)-ISAC systems using an autoencoder (AE) framework. We first derive the constellation-dependent detection probability and propose a novel loss function to include the sensing performance in the AE framework. Our simulation results demonstrate that constellation shaping enables a dynamic trade-off between communications and sensing. Depending on whether sensing or communications performance is prioritized, geometric or probabilistic constellation shaping is preferred. Joint constellation shaping combines the advantages of geometric and probabilistic shaping, significantly outperforming legacy modulation formats.
Abstract:This paper investigates the problem of computing capacity-cost (C-C) functions for continuous channels. Motivated by the Kullback-Leibler divergence (KLD) proximal reformulation of the classical Blahut-Arimoto (BA) algorithm, the Wasserstein distance is introduced into the proximal term for the continuous case, resulting in an iterative algorithm related to Wasserstein gradient descent. Practical implementation involves moving particles along the negative gradient direction of the objective function's first variation in the Wasserstein space and approximating integrals by the importance sampling (IS) technique. This formulation is also applied to the rate-distortion (R-D) function for continuous source spaces and thus provides a unified computation framework for both problems.
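As background, the classical discrete Blahut-Arimoto iteration that the paper generalizes can be sketched as follows (unconstrained capacity, no cost term; the paper's contribution is extending this to continuous alphabets via a Wasserstein proximal step, which is not shown here):

```python
import numpy as np

def blahut_arimoto(P, n_iter=500):
    """Capacity (in bits) of a discrete memoryless channel with
    transition matrix P[x, y] = P(y|x), via the classical BA
    alternating updates of the posterior and the input distribution."""
    nx, ny = P.shape
    p = np.full(nx, 1.0 / nx)                  # start from the uniform input
    for _ in range(n_iter):
        q = p[:, None] * P                     # joint p(x) P(y|x)
        q /= q.sum(axis=0, keepdims=True)      # posterior q(x|y)
        # update: p(x) proportional to exp( sum_y P(y|x) log q(x|y) )
        logq = np.log(np.where(q > 0, q, 1.0))
        logp = (P * logq).sum(axis=1)
        p = np.exp(logp - logp.max())
        p /= p.sum()
    py = p @ P                                 # induced output distribution
    ratio = np.where(P > 0, P / py[None, :], 1.0)
    return float((p[:, None] * P * np.log2(ratio)).sum())
```

For a binary symmetric channel with crossover probability 0.1, this returns the textbook value 1 - H2(0.1) ≈ 0.531 bits. The continuous-channel algorithm in the paper replaces the closed-form input update with a step of particles moving along the negative Wasserstein gradient of the first variation.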
Abstract:Multimodal information (e.g., visual, acoustic, and textual) has been widely used to enhance representation learning for micro-video recommendation. For integrating multimodal information into a joint representation of a micro-video, multimodal fusion plays a vital role in existing micro-video recommendation approaches. However, the static multimodal fusion used in previous studies is insufficient to model the various relationships among the multimodal information of different micro-videos. In this paper, we develop a novel meta-learning-based multimodal fusion framework called Meta Multimodal Fusion (MetaMMF), which dynamically assigns parameters to the multimodal fusion function for each micro-video during its representation learning. Specifically, MetaMMF regards the multimodal fusion of each micro-video as an independent task. Based on the meta information extracted from the multimodal features of the input task, MetaMMF parameterizes a neural network as the item-specific fusion function via a meta learner. We perform extensive experiments on three benchmark datasets, demonstrating significant improvements over several state-of-the-art multimodal recommendation models, such as MMGCN, LATTICE, and InvRL. Furthermore, we reduce the model's complexity by adopting canonical polyadic decomposition to improve training efficiency, and validate its effectiveness through experimental results. Codes are available at https://github.com/hanliu95/MetaMMF.