Abstract:A DeepCAPA (Deep Learning for Continuous Aperture Array (CAPA)) framework is proposed to learn beamforming in CAPA systems. The beamforming optimization problem is first formulated, and it is mathematically proved that the optimal beamforming lies in the subspace spanned by users' conjugate channel responses. Two challenges are encountered when directly applying deep neural networks (DNNs) to solve the formulated problem. i) Both the input and output spaces are infinite-dimensional, which is incompatible with DNNs. Finite-dimensional representations of the inputs and outputs are derived to address this challenge. ii) A closed-form loss function is unavailable for training the DNN. To tackle this challenge, two additional DNNs are trained to approximate the operations without closed-form expressions, which expedites gradient back-propagation. To improve learning performance and reduce training complexity, the permutation equivariance properties of the mappings to be learned are mathematically proved, and the DNNs are designed as graph neural networks to leverage these properties. Numerical results demonstrate that: i) the proposed DeepCAPA framework achieves higher spectral efficiency and lower inference complexity than matched filtering and the state-of-the-art Fourier-based discretization method, and ii) DeepCAPA approaches the performance upper bound of optimizing beamforming in the spatially discrete array-based system as the number of antennas in a fixed-sized area tends toward infinity.
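The permutation equivariance property mentioned in this abstract can be illustrated with a minimal NumPy sketch. It is not the DeepCAPA architecture: the layer form (a per-user linear map plus a mean-pooled term), the name `equivariant_layer`, and all dimensions are assumptions chosen only to show that reordering the users at the input reorders the outputs in the same way.

```python
import numpy as np

def equivariant_layer(x, w_self, w_agg):
    """Hypothetical layer. x has shape (K, d), one row per user."""
    # The mean over users does not depend on the user ordering.
    pooled = x.mean(axis=0, keepdims=True)
    # The same weights are applied to every user, so rows stay aligned.
    return np.tanh(x @ w_self + pooled @ w_agg)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))        # 4 users, 3 features each (assumed)
w_self = rng.standard_normal((3, 5))
w_agg = rng.standard_normal((3, 5))
perm = rng.permutation(4)

out = equivariant_layer(x, w_self, w_agg)
out_perm = equivariant_layer(x[perm], w_self, w_agg)

# Equivariance: permuting the inputs permutes the outputs identically.
print(np.allclose(out_perm, out[perm]))
```

Because the weights are shared across users, the number of trainable parameters in such a layer does not grow with the number of users, which is one common motivation for exploiting this property.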
Abstract:Recent advancements in large language models (LLMs) have driven a paradigm shift in process automation from Robotic Process Automation to Agentic Process Automation by automating the workflow orchestration procedure based on LLMs. However, existing LLMs (even the advanced OpenAI GPT-4o) still fall short of satisfactory capability in workflow orchestration. To address this limitation, we present WorkflowLLM, a data-centric framework designed to enhance the capability of LLMs in workflow orchestration. It first constructs a large-scale fine-tuning dataset, WorkflowBench, with 106,763 samples covering 1,503 APIs from 83 applications across 28 categories. Specifically, the construction process is divided into three phases: (1) Data Collection: we collect real-world workflow data from Apple Shortcuts and RoutineHub, transcribe them into Python-style code, and further equip them with hierarchical thoughts generated via ChatGPT. (2) Query Expansion: we prompt ChatGPT to generate more task queries to enrich the diversity and complexity of workflows. (3) Workflow Generation: we leverage an annotator model trained on the collected data to generate workflows for the synthesized queries. Finally, we merge the synthetic samples that pass quality confirmation with the collected samples to obtain WorkflowBench. Based on WorkflowBench, we fine-tune Llama-3.1-8B to obtain WorkflowLlama. Our experiments show that WorkflowLlama demonstrates a strong capacity to orchestrate complex workflows, while also achieving notable generalization performance on previously unseen APIs. Additionally, WorkflowBench exhibits robust zero-shot generalization capabilities on an out-of-distribution task planning dataset, T-Eval. Our data and code are available at https://github.com/OpenBMB/WorkflowLLM.
Abstract:The beamforming optimization in continuous aperture array (CAPA)-based multi-user communications is studied. In contrast to conventional spatially discrete antenna arrays, CAPAs can exploit the full spatial degrees of freedom (DoFs) by emitting information-bearing electromagnetic (EM) waves through a continuous source current distributed across the aperture. Nevertheless, such operation renders the beamforming optimization problem a non-convex integral-based functional programming problem, which is challenging for conventional discrete optimization methods. Two low-complexity approaches are proposed to solve the functional programming problem. 1) Calculus of variations (CoV)-based approach: The closed-form structure of the optimal continuous source patterns is derived based on CoV, inspiring a low-complexity integral-free iterative algorithm for solving the functional programming problem. 2) Correlation-based zero-forcing (Corr-ZF) approach: Closed-form ZF source current patterns that completely eliminate the inter-user interference are derived based on the channel correlations. By using these patterns, the original functional programming problem is transformed into a simple power allocation problem, which can be solved using the classical water-filling approach with reduced complexity. Our numerical results validate the effectiveness of the proposed designs and reveal that: i) compared to the state-of-the-art Fourier-based discretization approach, the proposed CoV-based approach not only improves communication performance but also reduces computational complexity by up to hundreds of times for large CAPA apertures and high frequencies, and ii) the proposed Corr-ZF approach achieves asymptotically optimal performance compared to the CoV-based approach.
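The classical water-filling step that the Corr-ZF approach reduces to can be sketched as follows. This is the textbook algorithm for maximizing a sum of log2(1 + g_k p_k) under a total power budget, not the paper's implementation; the function name `water_filling`, the effective gains, and the power budget are illustrative assumptions, and all gains are assumed strictly positive.

```python
import numpy as np

def water_filling(gains, total_power):
    """Allocate power p_k >= 0 with sum(p_k) = total_power (gains > 0 assumed)."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]      # strongest channel first
    inv = 1.0 / gains[order]             # per-channel "floor" heights 1/g_k
    # Try serving the n strongest channels, shrinking n until the water
    # level lies above the floor of the weakest served channel.
    for n in range(len(gains), 0, -1):
        level = (total_power + inv[:n].sum()) / n
        if level > inv[n - 1]:
            break
    power = np.zeros_like(gains)
    power[order[:n]] = level - inv[:n]   # unserved channels keep zero power
    return power

# Assumed effective gains of three users after zero-forcing.
p = water_filling([2.0, 1.0, 0.1], total_power=1.0)
print(p)  # [0.75 0.25 0.  ]
```

In this example the weakest user receives no power because its floor height 1/g = 10 lies above the water level of 1.25.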
Abstract:This article aims to unlock the potential of a prominent class of generative artificial intelligence (GAI) methods, namely diffusion models (DMs), for mobile communications. First, a DM-driven communication architecture is proposed, which introduces two key paradigms, i.e., conditional DM and DM-driven deep reinforcement learning (DRL), for wireless data generation and communication management, respectively. Then, we discuss the key advantages of DM-driven communication paradigms. To elaborate further, we explore DM-driven channel generation mechanisms for channel estimation, extrapolation, and feedback in multiple-input multiple-output (MIMO) systems. We showcase the numerical performance of conditional DM using the accurate DeepMIMO channel datasets, revealing its superiority in generating high-fidelity channels and mitigating unforeseen distribution shifts in sophisticated scenes. Furthermore, several DM-driven communication management designs are conceived, which are promising for dealing with imperfect channels and task-oriented communications. To inspire future research developments, we highlight the potential applications and open research challenges of DM-driven communications. Code is available at https://github.com/xiaoxiaxusummer/GAI_COMM/
Abstract:Deep learning is widely used in wireless communications but struggles with fixed neural network sizes, which limit adaptability in environments where the number of users and antennas varies. To overcome this, this paper introduces a generalization strategy for precoding and power allocation in scalable wireless networks. First, we abstract the wireless network into a homogeneous graph. This step bypasses the heterogeneous features between transmitter (TX) and user entities to construct a virtual homogeneous graph serving the optimization objectives, thereby enabling all nodes in the virtual graph to share the same neural network. The "TX entity" is a base station (BS) in cellular networks and an access point (AP) in cell-free networks. Subsequently, we design a universal graph neural network, termed the information carrying graph neural network (ICGNN), to capture and integrate information from this graph while maintaining permutation invariance. Lastly, using ICGNN as the core algorithm, we tailor the neural network's input and output to specific problem requirements and validate its performance in two scenarios: 1) in cellular networks, we develop a matrix-inverse-free multi-user multi-input multi-output (MU-MIMO) precoding scheme using the conjugate gradient (CG) method, adaptable to varying user and antenna numbers; 2) in a cell-free network, facing dynamic variations in the number of users served by APs, the number of APs serving each user, and the number of antennas per AP, we propose a universal power allocation scheme. Simulations demonstrate that the proposed approach not only significantly reduces computational complexity but also achieves, and potentially exceeds, the spectral efficiency (SE) of conventional algorithms.
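The idea of matrix-inverse-free precoding via the conjugate gradient method can be sketched with a plain CG solver applied to a regularized zero-forcing system. This is a generic illustration and not the ICGNN-based scheme of the paper: the channel dimensions, the regularization value `sigma2`, and the function name `conjugate_gradient` are assumptions.

```python
import numpy as np

def conjugate_gradient(a, b, iters=50, tol=1e-10):
    """Solve a @ x = b for Hermitian positive definite a, without inverting a."""
    x = np.zeros_like(b)
    r = b - a @ x                      # initial residual
    p = r.copy()                       # initial search direction
    rs = np.vdot(r, r).real
    for _ in range(iters):
        if np.sqrt(rs) < tol:          # residual small enough, stop
            break
        ap = a @ p
        alpha = rs / np.vdot(p, ap).real
        x = x + alpha * p
        r = r - alpha * ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs) * p      # next conjugate direction
        rs = rs_new
    return x

rng = np.random.default_rng(1)
K, N, sigma2 = 3, 8, 0.1               # users, antennas, regularization (assumed)
h = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
a = h @ h.conj().T + sigma2 * np.eye(K)

# Solve a @ z = I column by column, then form W = H^H (H H^H + sigma2 I)^{-1}.
eye = np.eye(K, dtype=complex)
z = np.column_stack([conjugate_gradient(a, eye[:, k]) for k in range(K)])
w = h.conj().T @ z                     # N x K precoder, no explicit inverse
print(np.allclose(a @ z, eye, atol=1e-6))
```

Each CG iteration needs only matrix-vector products, which is why such schemes are attractive when the numbers of users and antennas change between deployments.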
Abstract:In this paper, we propose a novel symbiotic sensing and communication (SSAC) framework, comprising a base station (BS) and a passive sensing node. In particular, the BS transmits a communication waveform to serve vehicle users (VUEs), while the sensing node executes sensing tasks based on the echoes in a bistatic manner, thereby avoiding the issue of self-interference. In addition to the weak target of interest, the sensing node tracks the VUEs and shares the sensing results with the BS to facilitate sensing-assisted beamforming. By considering both fully digital arrays and hybrid analog-digital (HAD) arrays, we investigate the beamforming design in the SSAC system. We first derive the Cramér-Rao lower bound (CRLB) of the two-dimensional angle-of-arrival estimation as the sensing metric. Next, we formulate an achievable sum rate maximization problem under the CRLB constraint, where the channel state information is reconstructed based on the sensing results. Then, we propose two penalty dual decomposition (PDD)-based alternating algorithms for fully digital and HAD arrays, respectively. Simulation results demonstrate that the proposed algorithms can achieve an outstanding data rate with effective localization capability for both the VUEs and the weak target. In particular, the HAD beamforming design exhibits a remarkable performance gain compared to conventional schemes, especially with fewer radio frequency chains.
Abstract:The performance of multiplexing and diversity achieved by continuous aperture arrays (CAPAs) over fading channels is analyzed. Angular-domain fading models are derived for CAPA-based multiple-input single-output (MISO), single-input multiple-output (SIMO), and multiple-input multiple-output (MIMO) channels using the Fourier relationship between the spatial response and its angular-domain counterpart. Building on these models, angular-domain transmission frameworks are proposed to facilitate CAPA-based communications, under which the performance of multiplexing and diversity is analyzed. 1) For SIMO and MISO channels, closed-form expressions are derived for the average data rate (ADR) and outage probability (OP). Additionally, asymptotic analyses are performed in the high signal-to-noise ratio (SNR) regime to unveil the maximal multiplexing gain and maximal diversity gain. The diversity-multiplexing trade-off (DMT) is also characterized, along with the array gain within the DMT framework. 2) For MIMO channels, high-SNR approximations are derived for the ADR and OP, based on which the DMT and associated array gain are revealed. The performance of CAPAs is further compared with that of conventional spatially discrete arrays (SPDAs) to highlight the superiority of CAPAs. The analytical and numerical results demonstrate that: i) compared to SPDAs, CAPAs achieve a lower OP and higher ADR, resulting in better spectral efficiency; ii) CAPAs achieve the same DMT as SPDAs with half-wavelength antenna spacing while attaining a larger array gain; and iii) CAPAs achieve a better DMT than SPDAs with antenna spacing greater than half a wavelength.
Abstract:The secrecy performance in both near-field and far-field communications is analyzed using two fundamental metrics: the secrecy capacity under a power constraint and the minimum power requirement to achieve a specified secrecy rate target. 1) For the secrecy capacity, a closed-form expression is derived under a discrete-time memoryless setup. This expression is further analyzed under several far-field and near-field channel models, and the capacity scaling law is revealed by assuming an infinitely large transmit array and an infinitely high power. A novel concept of "depth of insecurity" is proposed to evaluate the secrecy performance achieved by near-field beamfocusing. It is demonstrated that increasing the number of transmit antennas reduces this depth and thus improves the secrecy performance. 2) Regarding the minimum required power, a closed-form expression is derived and analyzed within far-field and near-field scenarios. Asymptotic analyses are performed by setting the number of transmit antennas to infinity to unveil the power scaling law. Numerical results are provided to demonstrate that: i) compared to far-field communications, near-field communications expand the areas where secure transmission is feasible, specifically when the eavesdropper is located in the same direction as the intended receiver; ii) as the number of transmit antennas increases, neither the secrecy capacity nor the minimum required power scales or vanishes unboundedly, adhering to the principle of energy conservation.
Abstract:The continuous aperture array (CAPA) can provide more degrees of freedom and higher spatial resolution than the spatially discrete array (SPDA), and optimizing multi-user current distributions in CAPA systems is crucial but challenging. The challenge arises from solving non-convex functional optimization problems whose objective functions and constraints lack closed-form expressions. In this paper, we propose a deep learning framework called L-CAPA to learn current distribution policies. In the framework, we find finite-dimensional representations of channel functions and current distributions, allowing them to serve as the inputs and outputs of a deep neural network (DNN) that learns the policy. The integrals in the loss function have no closed-form expressions, which hinders training the DNN in an unsupervised manner; to address this issue, we design two additional DNNs to learn the integrals. The DNNs are designed as graph neural networks to incorporate the permutation properties of the mappings to be learned, thereby improving learning performance. Simulation results show that L-CAPA can achieve the performance upper bound of optimizing precoding in the SPDA system as the number of antennas approaches infinity, with low inference complexity.
Abstract:A novel low-complexity wavenumber-domain method is proposed for near-field sensing (NISE). Specifically, the power-concentrated region of the wavenumber-domain channels is related to the target position in a non-linear manner. Based on this observation, a bi-directional convolutional neural network (BiCNN)-based approach is proposed to capture this relationship, thereby facilitating low-complexity target localization. This method enables direct and gridless target localization using only a limited bandwidth and a single antenna array. Simulation results demonstrate that: 1) during the offline training phase, the proposed BiCNN method can learn to localize the target with fewer trainable parameters than naive neural network architectures; and 2) during the online implementation phase, the BiCNN method takes 100x less time while maintaining performance comparable to the conventional two-dimensional multiple signal classification (MUSIC) algorithm.