Abstract:In this paper, a novel multi-modal intelligent channel model for sixth-generation (6G) multiple-unmanned aerial vehicle (multi-UAV)-to-multi-vehicle communications is proposed. To thoroughly explore the mapping relationship between the physical environment and the electromagnetic space in the complex multi-UAV-to-multi-vehicle scenario, two new parameters, i.e., terrestrial traffic density (TTD) and aerial traffic density (ATD), are introduced, and a new sensing-communication intelligent integrated dataset is constructed in a suburban scenario under different TTD and ATD conditions. With the aid of sensing data, i.e., light detection and ranging (LiDAR) point clouds, the parameters of static scatterers, terrestrial dynamic scatterers, and aerial dynamic scatterers in the electromagnetic space, e.g., number, distance, angle, and power, are quantified under different TTD and ATD conditions in the physical environment. In the proposed model, the channel non-stationarity and consistency in the time and space domains and the channel non-stationarity in the frequency domain are simultaneously mimicked. The channel statistical properties, such as the time-space-frequency correlation function (TSF-CF), time stationary interval (TSI), and Doppler power spectral density (DPSD), are derived and simulated. Simulation results match ray-tracing (RT) results well, which verifies the accuracy of the proposed multi-UAV-to-multi-vehicle channel model.
Abstract:Given the importance of datasets for sensing-communication integration research, a novel simulation platform for constructing communication and multi-modal sensory datasets is developed. The developed platform integrates three high-precision software tools, i.e., AirSim, WaveFarer, and Wireless InSite, and further achieves their in-depth integration and precise alignment. Based on the developed platform, a new synthetic intelligent multi-modal sensing-communication dataset for Synesthesia of Machines (SoM), named SynthSoM, is proposed. The SynthSoM dataset contains various air-ground multi-link cooperative scenarios with comprehensive conditions, including multiple weather conditions, times of the day, intelligent agent densities, frequency bands, and antenna types. The SynthSoM dataset encompasses multiple data modalities, including radio-frequency (RF) channel large-scale and small-scale fading data, RF millimeter wave (mmWave) radar sensory data, and non-RF sensory data, e.g., RGB images, depth maps, and light detection and ranging (LiDAR) point clouds. The quality of the SynthSoM dataset is validated through statistics-based qualitative inspection and through machine learning (ML)-based evaluation metrics using real-world measurements. The SynthSoM dataset is open-sourced and provides consistent data for cross-comparing SoM-related algorithms.
Abstract:This paper proposes a novel sixth-generation (6G) multi-modal intelligent vehicle-to-vehicle (V2V) channel model from light detection and ranging (LiDAR) point clouds based on Synesthesia of Machines (SoM). To explore the mapping relationship between the physical environment and the electromagnetic space, a new V2V high-fidelity mixed sensing-communication integration simulation dataset with different vehicular traffic densities (VTDs) is constructed. Based on the constructed dataset, a novel scatterer recognition (ScaR) algorithm utilizing the SegNet neural network is developed to recognize scatterer spatial attributes from LiDAR point clouds via SoM. In the developed ScaR algorithm, the mapping relationship between LiDAR point clouds and scatterers is explored, where the distribution of scatterers is obtained in the form of grid maps. Furthermore, scatterers are classified as dynamic or static based on LiDAR point cloud features, and scatterer-related parameters, e.g., distance, angle, and number, are determined. Through ScaR, dynamic and static scatterers change with the variation of LiDAR point clouds over time, which precisely models channel non-stationarity and consistency under different VTDs. Some important channel statistical properties, such as the time-frequency correlation function (TF-CF) and Doppler power spectral density (DPSD), are obtained. Simulation results match well with ray-tracing (RT)-based results, thus demonstrating the necessity of exploring the mapping relationship and the utility of the proposed model.
Abstract:Low-rank adaptation (LoRA) reduces the computational and memory demands of fine-tuning large language models (LLMs) by approximating updates with low-rank matrices. However, low-rank approximation in two-dimensional space fails to capture high-dimensional structures within the target matrix. Recently, tensor decomposition methods have been explored for fine-tuning LLMs, leveraging their ability to extract structured information. Yet, these approaches primarily rely on random initialization, and the impact of initialization on tensor adaptation remains underexplored. In this paper, we reveal that random initialization leads to a validation loss that diverges significantly from the one achieved by full fine-tuning. To address this, we propose Weight-Decomposed Tensor Adaptation (DoTA), which leverages the Matrix Product Operator (MPO) decomposition of pre-trained weights for effective initialization in fine-tuning LLMs. Additionally, we introduce QDoTA, a quantized version of DoTA designed for 4-bit quantization. Experiments on commonsense and arithmetic reasoning tasks show that DoTA outperforms random initialization methods with fewer parameters. QDoTA further reduces memory consumption and achieves comparable performance to DoTA on commonsense reasoning tasks. We will release our code to support future research.
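As a rough illustration of the low-rank update that the abstract takes as its starting point, the sketch below shows plain LoRA only: a frozen weight matrix W is adapted as W + BA, where B and A have a small inner rank. It does not implement DoTA or the MPO decomposition; the function names and the layer sizes in the usage note are assumptions chosen for illustration.

```python
def full_param_count(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters of B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_update(w, b, a):
    """Return W + B A, the adapted weight under a low-rank update."""
    delta = matmul(b, a)
    return [[w_ij + d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]
```

For a hypothetical 4096 x 4096 layer, `lora_param_count(4096, 4096, 8)` gives 65,536 trainable parameters against 16,777,216 for the full matrix, which is the saving the first sentence of the abstract refers to.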
Abstract:Unstructured text data annotation and analysis are fundamental to management research, often relying on human annotators through crowdsourcing platforms. While Large Language Models (LLMs) promise to provide a cost-effective and efficient alternative to human annotation, there is no systematic workflow that evaluates when LLMs are suitable or how to proceed with LLM-based text annotation in a reproducible manner. This paper addresses this methodological gap by introducing the ``SILICON'' (\textbf{S}ystematic \textbf{I}nference with \textbf{L}LMs for \textbf{I}nformation \textbf{C}lassificati\textbf{o}n and \textbf{N}otation) workflow. The workflow integrates established principles of human annotation with systematic prompt optimization and model selection, addressing challenges such as developing robust annotation guidelines, establishing high-quality human baselines, optimizing prompts, and ensuring reproducibility across LLMs. We validate the SILICON workflow through seven case studies covering common management research tasks, including business proposal evaluation, dialog intent and breakdown analysis, and review attribute detection. Our findings highlight the importance of validating annotation guideline agreement, the superiority of expert-developed human baselines over crowdsourced ones, the iterative nature of prompt optimization, and the necessity of testing multiple LLMs. Notably, we propose a regression-based methodology to empirically compare LLM outputs across prompts and models. Our workflow advances management research by establishing reproducible processes for LLM-based annotation that maintain scientific rigor. We provide practical guidance for researchers to effectively navigate the evolving landscape of generative AI tools while maintaining transparency and reproducibility.
Abstract:The potential benefits of integrated sensing and communication (ISAC) are anticipated to play a significant role in future sub-terahertz (sub-THz) systems. However, the beam squint effect is pronounced in sub-THz systems, expanding coverage areas while severely degrading communication performance. Existing hybrid precoding designs struggle to balance both functionalities in the presence of beam squint, limiting the performance gain achievable through ISAC. To address this challenge, we propose two squint-aware hybrid precoding schemes for sub-THz systems that proactively regulate the correlation between communication and sensing channels, leveraging the inherent degrees of freedom in the hardware to enhance integrated gain. We introduce a squint-aware optimization-based hybrid precoding algorithm (SA-Opt) and develop an unsupervised learning-assisted complex-valued squint-aware network (CSP-Net) to reduce complexity, tailoring its architecture to the specific data and task characteristics. The effectiveness of the proposed schemes is demonstrated through simulations.
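The beam squint effect mentioned above can be illustrated with the textbook model of a uniform linear array whose phase shifters are set once for the center frequency. The sketch below is only that generic model, not the SA-Opt or CSP-Net schemes; the half-wavelength spacing at the center frequency, the function names, and the numbers in the usage note are assumptions.

```python
import cmath
import math


def array_gain(n_ant, f, f_c, theta0, theta):
    """Normalized gain of a uniform linear array at frequency f and angle theta.

    Assumes element spacing of half a wavelength at f_c and frequency-flat
    phase shifters that steer the beam to theta0 at f_c (angles in radians).
    """
    # Residual phase step between adjacent elements after phase shifting.
    delta = math.pi * (math.sin(theta0) - (f / f_c) * math.sin(theta))
    total = sum(cmath.exp(1j * n * delta) for n in range(n_ant))
    return abs(total) / n_ant


def squinted_angle(f, f_c, theta0):
    """Direction in which the beam actually peaks at frequency f."""
    return math.asin((f_c / f) * math.sin(theta0))
```

With a hypothetical 64-element array steered to 30 degrees at 140 GHz, evaluating `array_gain` at 147 GHz in the intended direction shows a clear loss relative to the center frequency, while the full gain reappears at `squinted_angle(147e9, 140e9, theta0)`. This is the trade-off the abstract describes: the beam covers different directions across the band, at the cost of gain toward the intended user.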
Abstract:Channel prediction enables the acquisition of channel state information (CSI) without signaling overhead. However, almost all existing channel prediction methods require a dedicated model to be deployed for each specific configuration. Leveraging the powerful modeling and multi-task learning capabilities of foundation models, we propose the first space-time-frequency (STF) wireless foundation model (WiFo) to address time-frequency channel prediction tasks in a one-for-all manner. Specifically, WiFo is first pre-trained on massive and diverse CSI datasets. The model can then be used directly for channel prediction under various CSI configurations without any fine-tuning. We propose a masked autoencoder (MAE)-based network structure for WiFo to handle heterogeneous STF CSI data, and design several mask reconstruction tasks for self-supervised pre-training to capture the inherent 3D variations of CSI. To fully unleash its predictive power, we build a large-scale heterogeneous simulated CSI dataset consisting of 160K CSI samples for pre-training. Simulations validate its superior unified learning performance across multiple datasets and demonstrate its state-of-the-art (SOTA) zero-shot generalization performance via comparisons with other full-shot baselines.
Abstract:In this paper, we propose a novel dependency-aware task scheduling strategy for dynamic unmanned aerial vehicle-assisted connected autonomous vehicles (CAVs). Specifically, the computation tasks of CAVs, each consisting of multiple dependent subtasks, are judiciously assigned to nearby CAVs or the base station so that tasks are completed promptly. We formulate a joint scheduling priority and subtask assignment optimization problem with the objective of minimizing the average task completion time. Since the problem aims at improving long-term system performance, it is reformulated as a Markov decision process. To solve the problem, we further propose a diffusion-based reinforcement learning algorithm, named Synthetic DDQN based Subtasks Scheduling, which can make adaptive task scheduling decisions in real time. A diffusion model-based synthetic experience replay is integrated into the reinforcement learning framework, which can generate sufficient synthetic data in the experience replay buffer, thereby significantly accelerating convergence and improving sample efficiency. Simulation results demonstrate the effectiveness of the proposed algorithm in reducing task completion time compared with benchmark schemes.
Abstract:In the future sixth-generation (6G) era, to support accurate localization sensing and efficient communication link establishment for intelligent agents, a comprehensive understanding of the surrounding environment and proper channel modeling are indispensable. The existing approach, which solely exploits radio frequency (RF) communication information, struggles to achieve accurate channel modeling. Fortunately, multi-modal devices are deployed on intelligent agents to obtain environmental features, which could further assist in channel modeling. Currently, some research efforts have been devoted to utilizing multi-modal information to facilitate channel modeling, but a comprehensive review is still lacking. To fill this gap, we embark on an initial endeavor with the goal of reviewing multi-modal intelligent channel modeling (MMICM) via Synesthesia of Machines (SoM). Compared to channel modeling approaches that solely utilize RF communication information, the utilization of multi-modal information can provide a more in-depth understanding of the propagation environment around the transceiver, thus facilitating more accurate channel modeling. First, this paper introduces existing channel modeling approaches from the perspective of the channel modeling evolution. Then, we elaborate on and investigate recent advances in capturing typical channel characteristics and features, i.e., channel non-stationarity and consistency, by characterizing the mathematical, spatial, coupling, and mapping relationships. In addition, applications that can be supported by MMICM are summarized and analyzed. To corroborate the superiority of MMICM via SoM, we present simulation results and analysis. Finally, some open issues and potential directions for MMICM are outlined from the perspectives of measurements, modeling, and applications.
Abstract:We show theoretically and empirically that the linear Transformer, when applied to graph data, can implement algorithms that solve canonical problems such as electric flow and eigenvector decomposition. The input to the Transformer is simply the graph incidence matrix; no other explicit positional encoding information is provided. We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms. Our theoretical findings are corroborated by experiments on synthetic data. Additionally, on a real-world molecular regression task, we observe that the linear Transformer is capable of learning a more effective positional encoding than the default one based on Laplacian eigenvectors. Our work is an initial step towards elucidating the inner workings of the Transformer for graph data.
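To make the electric flow problem concrete, the sketch below computes it directly from an incidence matrix with a classical linear solve. It is not the Transformer construction from the paper. It assumes unit edge resistances, a connected graph, a node-by-edge incidence convention with +1 at the tail and -1 at the head of each edge, and demands that sum to zero; all function names are illustrative.

```python
def incidence_matrix(n_nodes, edges):
    """Node-by-edge incidence matrix: +1 at the tail, -1 at the head."""
    b = [[0.0] * len(edges) for _ in range(n_nodes)]
    for e, (u, v) in enumerate(edges):
        b[u][e] = 1.0
        b[v][e] = -1.0
    return b


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def electric_flow(n_nodes, edges, demand):
    """Edge currents for the given net current injected at each node."""
    b = incidence_matrix(n_nodes, edges)
    n_edges = len(edges)
    # Graph Laplacian L = B B^T under unit resistances.
    lap = [[sum(b[i][e] * b[j][e] for e in range(n_edges))
            for j in range(n_nodes)] for i in range(n_nodes)]
    # Ground the last node (potential 0) so the reduced system is invertible.
    k = n_nodes - 1
    reduced = [row[:k] for row in lap[:k]]
    phi = solve(reduced, demand[:k]) + [0.0]
    # Ohm's law with unit resistance: current equals the potential drop.
    return [phi[u] - phi[v] for u, v in edges]
```

On a triangle with edges (0, 1), (1, 2), (0, 2) and one unit of current injected at node 0 and extracted at node 2, the direct edge carries two thirds of the current and the two-hop path carries one third, as expected from the 1:2 resistance ratio of the two routes.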