Abstract:Reinforcement learning (RL) has proved to have a promising role in future intelligent wireless networks. Online RL has been adopted for radio resource management (RRM), taking over traditional schemes. However, due to its reliance on online interaction with the environment, its role becomes limited in practical, real-world problems where online interaction is not feasible. In addition, traditional RL stands short in front of the uncertainties and risks in real-world stochastic environments. In this manner, we propose an offline and distributional RL scheme for the RRM problem, enabling offline training using a static dataset without any interaction with the environment and considering the sources of uncertainties using the distributions of the return. Simulation results demonstrate that the proposed scheme outperforms conventional resource management models. In addition, it is the only scheme that surpasses online RL and achieves a $16 \%$ gain over online RL.
Abstract:Long Range - Frequency Hopping Spread Spectrum (LR-FHSS) is an emerging and promising technology recently introduced into the LoRaWAN protocol specification for both terrestrial and non-terrestrial networks, notably satellites. The higher capacity, long-range and robustness to Doppler effect make LR-FHSS a primary candidate for direct-to-satellite (DtS) connectivity for enabling Internet-of-things (IoT) in remote areas. The LR-FHSS devices envisioned for DtS IoT will be primarily battery-powered. Therefore, it is crucial to investigate the current consumption characteristics and Time-on-Air (ToA) of LR-FHSS technology. However, to our knowledge, no prior research has presented the accurate ToA and current consumption models for this newly introduced scheme. This paper addresses this shortcoming through extensive field measurements and the development of analytical models. Specifically, we have measured the current consumption and ToA for variable transmit power, message payload, and two new LR-FHSS-based Data Rates (DR8 and DR9). We also develop current consumption and ToA analytical models demonstrating a strong correlation with the measurement results exhibiting a relative error of less than 0.3%. Thus, it confirms the validity of our models. Conversely, the existing analytical models exhibit a higher relative error rate of -9.2 to 3.4% compared to our measurement results. The presented in this paper results can be further used for simulators or in analytical studies to accurately model the on-air time and energy consumption of LR-FHSS devices.
Abstract:Multi-User Multiple-Input Multiple-Output (MU-MIMO) is a pivotal technology in present-day wireless communication systems. In such systems, a base station or Access Point (AP) is equipped with multiple antenna elements and serves multiple active devices simultaneously. Nevertheless, most of the works evaluating the performance of MU-MIMO systems consider APs with static antenna arrays, that is, without any movement capability. Recently, the idea of APs and antenna arrays that are able to move have gained traction among the research community. Many works evaluate the communications performance of antenna systems able to move on the horizontal plane. However, such APs require a very bulky, complex and expensive movement system. In this work, we propose a simpler and cheaper alternative: the utilization of rotary APs, i.e. APs that can rotate. We also analyze the performance of a system in which the AP is able to both move and rotate. The movements and/or rotations of the APs are computed in order to maximize the mean per-user achievable spectral efficiency, based on estimates of the locations of the active devices and using particle swarm optimization. We adopt a spatially correlated Rician fading channel model, and evaluate the resulting optimized performance of the different setups in terms of mean per-user achievable spectral efficiencies. Our numerical results show that both the optimal rotations and movements of the APs can provide substantial performance gains when the line-of-sight components of the channel vectors are strong. Moreover, the simpler rotary APs can outperform the movable APs when their movement area is constrained.
Abstract:Efficient Random Access (RA) is critical for enabling reliable communication in Industrial Internet of Things (IIoT) networks. Herein, we propose a deep reinforcement learning based distributed RA scheme, entitled Neural Network-Based Bandit (NNBB), for the IIoT alarm scenario. In such a scenario, the devices may detect a common critical event, and the goal is to ensure the alarm information is delivered successfully from at least one device. The proposed NNBB scheme is implemented at each device, where it trains itself online and establishes implicit inter-device coordination to achieve the common goal. Devices can transmit simultaneously on multiple orthogonal channels and each possible transmission pattern constitutes a possible action for the NNBB, which uses a deep neural network to determine the action. Our simulation results show that as the number of devices in the network increases, so does the performance gain of the NNBB compared to the Multi-Armed Bandit (MAB) RA benchmark. For instance, NNBB experiences a 7% success rate drop when there are four channels and the number of devices increases from 10 to 60, while MAB faces a 25% drop.
Abstract:The Fifth-Generation (5G) wireless communications networks introduced native support for Machine-Type Communications (MTC) use cases. Nevertheless, current 5G networks cannot fully meet the very stringent requirements regarding latency, reliability, and number of connected devices of most MTC use cases. Industry and academia have been working on the evolution from 5G to Sixth Generation (6G) networks. One of the main novelties is adopting Distributed Multiple-Input Multiple-Output (D-MIMO) networks. However, most works studying D-MIMO consider antenna arrays with no movement capabilities, even though some recent works have shown that this could bring substantial performance improvements. In this work, we propose the utilization of Access Points (APs) equipped with Rotary Uniform Linear Arrays (RULAs) for this purpose. Considering a spatially correlated Rician fading model, the optimal angular position of the RULAs is jointly computed by the central processing unit using particle swarm optimization as a function of the position of the active devices. Considering the impact of imperfect positioning estimates, our numerical results show that the RULAs's optimal rotation brings substantial performance gains in terms of mean per-user spectral efficiency. The improvement grows with the strength of the line-of-sight components of the channel vectors. Given the total number of antenna elements, we study the trade-off between the number of APs and the number of antenna elements per AP, revealing an optimal number of APs for the cases of APs equipped with static ULAs and RULAs.
Abstract:The Internet of Things paradigm heavily relies on a network of a massive number of machine-type devices (MTDs) that monitor changes in various phenomena. Consequently, MTDs are randomly activated at different times whenever a change occurs. This essentially results in relatively few MTDs being active simultaneously compared to the entire network, resembling targeted sampling in compressed sensing. Therefore, signal recovery in machine-type communications is addressed through joint user activity detection and channel estimation algorithms built using compressed sensing theory. However, most of these algorithms follow a two-stage procedure in which a channel is first estimated and later mapped to find active users. This approach is inefficient because the estimated channel information is subsequently discarded. To overcome this limitation, we introduce a novel covariance-learning matching pursuit algorithm that bypasses explicit channel estimation. Instead, it focuses on estimating the indices of the active users greedily. Simulation results presented in terms of probability of miss detection, exact recovery rate, and computational complexity validate the proposed technique's superior performance and efficiency.
Abstract:Digital twin (DT) platforms are increasingly regarded as a promising technology for controlling, optimizing, and monitoring complex engineering systems such as next-generation wireless networks. An important challenge in adopting DT solutions is their reliance on data collected offline, lacking direct access to the physical environment. This limitation is particularly severe in multi-agent systems, for which conventional multi-agent reinforcement (MARL) requires online interactions with the environment. A direct application of online MARL schemes to an offline setting would generally fail due to the epistemic uncertainty entailed by the limited availability of data. In this work, we propose an offline MARL scheme for DT-based wireless networks that integrates distributional RL and conservative Q-learning to address the environment's inherent aleatoric uncertainty and the epistemic uncertainty arising from limited data. To further exploit the offline data, we adapt the proposed scheme to the centralized training decentralized execution framework, allowing joint training of the agents' policies. The proposed MARL scheme, referred to as multi-agent conservative quantile regression (MA-CQR) addresses general risk-sensitive design criteria and is applied to the trajectory planning problem in drone networks, showcasing its advantages.
Abstract:Contemporary wireless communication systems rely on Multi-User Multiple-Input Multiple-Output (MU-MIMO) techniques. In such systems, each Access Point (AP) is equipped with multiple antenna elements and serves multiple devices simultaneously. Notably, traditional systems utilize fixed antennas, i.e., antennas without any movement capabilities, while the idea of movable antennas has recently gained traction among the research community. By moving in a confined region, movable antennas are able to exploit the wireless channel variation in the continuous domain. This additional degree of freedom may enhance the quality of the wireless links, and consequently the communication performance. However, movable antennas for MU-MIMO proposed in the literature are complex, bulky, expensive and present a high power consumption. In this paper, we propose an alternative to such systems that has lower complexity and lower cost. More specifically, we propose the incorporation of rotation capabilities to APs equipped with Uniform Linear Arrays (ULAs) of antennas. We consider the uplink of an indoor scenario where the AP serves multiple devices simultaneously. The optimal rotation of the ULA is computed based on estimates of the positions of the active devices and aiming at maximizing the per-user mean achievable Spectral Efficiency (SE). Adopting a spatially correlated Rician channel model, our numerical results show that the rotation capabilities of the AP can bring substantial improvements in the SE in scenarios where the line-of-sight component of the channel vectors is strong. Moreover, our proposed system is robust against imperfect positioning estimates.
Abstract:The age of information (AoI) is used to measure the freshness of the data. In IoT networks, the traditional resource management schemes rely on a message exchange between the devices and the base station (BS) before communication which causes high AoI, high energy consumption, and low reliability. Unmanned aerial vehicles (UAVs) as flying BSs have many advantages in minimizing the AoI, energy-saving, and throughput improvement. In this paper, we present a novel learning-based framework that estimates the traffic arrival of IoT devices based on Markovian events. The learning proceeds to optimize the trajectory of multiple UAVs and their scheduling policy. First, the BS predicts the future traffic of the devices. We compare two traffic predictors: the forward algorithm (FA) and the long short-term memory (LSTM). Afterward, we propose a deep reinforcement learning (DRL) approach to optimize the optimal policy of each UAV. Finally, we manipulate the optimum reward function for the proposed DRL approach. Simulation results show that the proposed algorithm outperforms the random-walk (RW) baseline model regarding the AoI, scheduling accuracy, and transmission power.
Abstract:Internet of Things (IoT) sustainability may hinge on radio frequency wireless energy transfer (RF-WET). However, energy-efficient charging strategies are still needed, motivating our work. Specifically, this letter proposes a time division scheme to efficiently charge low-power devices in an IoT network. For this, a multi-antenna power beacon (PB) drives the devices' energy harvesting circuit to the highest power conversion efficiency point via energy beamforming, thus achieving minimum energy consumption. Herein, we adopt the analog multi-antenna architecture due to its low complexity, cost, and energy consumption. The proposal includes a simple yet accurate model for the transfer characteristic of the energy harvesting circuit, enabling the optimization framework. The results evince the effectiveness of our RF-WET strategy over a benchmark scheme where the PB charges all the IoT devices simultaneously. Furthermore, the performance increases with the number of PB antennas.