Abstract:Existing research on non-line-of-sight (NLoS) ultraviolet (UV) channel modeling mainly focuses on scenarios where the signal propagation process is not affected by any obstacle and the radiation intensity (RI) of the light source is uniformly distributed. To eliminate these restrictions, we propose a single-collision model for the NLoS UV channel incorporating a cuboid-shaped obstacle, where the RI of the UV light source is modeled as the Lambertian distribution. For easy interpretation, we categorize the intersection circumstances between the receiver field-of-view and the obstacle into six cases and provide derivations of the weighting factor for each case. To investigate the accuracy of the proposed model, we compare it with the associated Monte Carlo photon tracing model via simulations and experiments. Results verify the correctness of the proposed model. This work reveals that obstacle avoidance is not always beneficial for NLoS UV communications and provides guidelines for relevant system design.
Abstract:Nowadays, Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications. However, long context training poses great challenges considering the constraint of GPU memory. It not only leads to substantial activation memory consumption during training, but also incurs considerable memory fragmentation. To facilitate long context training, existing frameworks have adopted strategies such as recomputation and various forms of parallelisms. Nevertheless, these techniques rely on redundant computation or extensive communication, resulting in low Model FLOPS Utilization (MFU). In this paper, we propose MEMO, a novel LLM training framework designed for fine-grained activation memory management. Given the quadratic scaling of computation and linear scaling of memory with sequence lengths when using FlashAttention, we offload memory-consuming activations to CPU memory after each layer's forward pass and fetch them during the backward pass. To maximize the swapping of activations without hindering computation, and to avoid exhausting limited CPU memory, we implement a token-wise activation recomputation and swapping mechanism. Furthermore, we tackle the memory fragmentation issue by employing a bi-level Mixed Integer Programming (MIP) approach, optimizing the reuse of memory across transformer layers. Empirical results demonstrate that MEMO achieves an average of 2.42x and 2.26x MFU compared to Megatron-LM and DeepSpeed, respectively. This improvement is attributed to MEMO's ability to minimize memory fragmentation, reduce recomputation and intensive communication, and circumvent the delays associated with the memory reorganization process due to fragmentation. By leveraging fine-grained activation memory management, MEMO facilitates efficient training of 7B LLM with 1 million sequence length on just 8 A800 GPUs, achieving an MFU of 52.30%.
Abstract:Optical intelligent reflecting surface (OIRS) offers a new and effective approach to resolving the line-of-sight blockage issue in visible light communication (VLC) by enabling redirection of light to bypass obstacles, thereby dramatically enhancing indoor VLC coverage and reliability. This article provides a comprehensive overview of OIRS for VLC, including channel modeling, design techniques, and open issues. First, we present the characteristics of OIRS-reflected channels and introduce two practical models, namely, optics model and association model, which are then compared in terms of applicable conditions, configuration methods, and channel parameters. Next, under the more practically appealing association model, we discuss the main design techniques for OIRS-aided VLC systems, including beam alignment, channel estimation, and OIRS reflection optimization. Finally, open issues are identified to stimulate future research in this area.
Abstract:Optical intelligent reflecting surface (OIRS) has attracted increasing attention due to its capability of overcoming signal blockages in visible light communication (VLC), an emerging technology for the next-generation advanced transceivers. However, current works on OIRS predominantly assume known channel state information (CSI), which is essential to practical OIRS configuration. To bridge such a gap, this paper proposes a new and customized channel estimation protocol for OIRSs under the alignment-based channel model. Specifically, we first unveil OIRS spatial and temporal coherence characteristics and derive the coherence distance and the coherence time in closed form. Next, to achieve fast beam alignment over different coherence time, we propose to dynamically tune the rotational angles of the OIRS reflecting elements following a geometric optics-based non-uniform codebook. Given the above beam alignment, we propose an efficient joint space-time sampling-based algorithm to estimate the OIRS channel. In particular, we divide the OIRS into multiple subarrays based on the coherence distance and sequentially estimate their associated CSI, followed by a spacetime interpolation to retrieve full CSI for other non-aligned transceiver antennas. Numerical results validate our theoretical analyses and demonstrate the efficacy of our proposed OIRS channel estimation scheme as compared to other benchmark schemes.
Abstract:Optical intelligent reflecting surface (OIRS) has been considered a promising technology for visible light communication (VLC) by constructing visual line-of-sight propagation paths to address the signal blockage issue. However, the existing works on OIRSs are mostly based on perfect channel state information (CSI), whose acquisition appears to be challenging due to the passive nature of the OIRS. To tackle this challenge, this paper proposes a customized channel estimation algorithm for OIRSs. Specifically, we first unveil the OIRS spatial coherence characteristics and derive the coherence distance in closed form. Based on this property, a spatial sampling-based algorithm is proposed to estimate the OIRS-reflected channel, by dividing the OIRS into multiple subarrays based on the coherence distance and sequentially estimating their associated CSI, followed by an interpolation to retrieve the full CSI. Simulation results validate the derived OIRS spatial coherence and demonstrate the efficacy of the proposed OIRS channel estimation algorithm.
Abstract:Integrated sensing and communication (ISAC) is viewed as a crucial component of future mobile networks and has gained much interest in both academia and industry. Similar to the emergence of radio-frequency (RF) ISAC, the integration of free space optical communication and optical sensing yields optical ISAC (O-ISAC), which is regarded as a powerful complement to its RF counterpart. In this article, we first introduce the generalized system structure of O-ISAC, and then elaborate on three advantages of O-ISAC, i.e., increasing communication rate, enhancing sensing precision, and reducing interference. Next, waveform design and resource allocation of O-ISAC are discussed based on pulse waveform, constant-modulus waveform, and multi-carrier waveform. Furthermore, we put forward future trends and challenges of O-ISAC, which are expected to provide some valuable directions for future research.
Abstract:As one of the six usage scenarios of the sixth generation (6G) mobile communication system, integrated sensing and communication (ISAC) has garnered considerable attention, and numerous studies have been conducted on radio-frequency (RF)-ISAC. Benefitting from the communication and sensing capabilities of an optical system, free space optical (FSO)-ISAC becomes a potential complement to RF-ISAC. In this paper, a direct-current-biased optical orthogonal frequency division multiplexing (DCO-OFDM) scheme is proposed for FSO-ISAC. To derive the spectral efficiency for communication and the Fisher information for sensing as performance metrics, we model the clipping noise of DCO-OFDM as additive colored Gaussian noise to obtain the expression of the signal-to-noise ratio. Based on the derived performance metrics, joint power allocation problems are formulated for both communication-centric and sensing-centric scenarios. In addition, the non-convex joint optimization problems are decomposed into sub-problems for DC bias and subcarriers, which can be solved by block coordinate descent algorithms. Furthermore, numerical simulations demonstrate the proposed algorithms and reveal the trade-off between communication and sensing functionalities of the OFDM-based FSO-ISAC system.
Abstract:Increasing research interests focus on sequential recommender systems, aiming to model dynamic sequence representation precisely. However, the most commonly used loss function in state-of-the-art sequential recommendation models has essential limitations. To name a few, Bayesian Personalized Ranking (BPR) loss suffers the vanishing gradient problem from numerous negative sampling and predictionbiases; Binary Cross-Entropy (BCE) loss subjects to negative sampling numbers, thereby it is likely to ignore valuable negative examples and reduce the training efficiency; Cross-Entropy (CE) loss only focuses on the last timestamp of the training sequence, which causes low utilization of sequence information and results in inferior user sequence representation. To avoid these limitations, in this paper, we propose to calculate Cumulative Cross-Entropy (CCE) loss over the sequence. CCE is simple and direct, which enjoys the virtues of painless deployment, no negative sampling, and effective and efficient training. We conduct extensive experiments on five benchmark datasets to demonstrate the effectiveness and efficiency of CCE. The results show that employing CCE loss on three state-of-the-art models GRU4Rec, SASRec, and S3-Rec can reach 125.63%, 69.90%, and 33.24% average improvement of full ranking NDCG@5, respectively. Using CCE, the performance curve of the models on the test data increases rapidly with the wall clock time, and is superior to that of other loss functions in almost the whole process of model training.
Abstract:With the capability of reconfiguring the wireless electromagnetic environment, intelligent reflecting surface (IRS) is a new paradigm for designing future wireless communication systems. In this paper, we consider optical IRS for improving the performance of visible light communication (VLC) under a multiple-input and multiple-output (MIMO) setting. Specifically, we focus on the downlink communication of an indoor MIMO VLC system and aim to minimize the mean square error (MSE) of demodulated signals at the receiver. To this end, the MIMO channel gain of the IRS-aided VLC is first derived under the point source assumption, based on which the MSE minimization problem is then formulated subject to the emission power constraints. Next, we propose an alternating optimization algorithm, which decomposes the original problem into three subproblems, to iteratively optimize the IRS configuration, the precoding and detection matrices for minimizing the MSE. Moreover, theoretical analysis on the performance of the proposed algorithm in high and low signal-to-noise rate (SNR) regimes is provided, revealing that the joint optimization process can be simplified in such special cases, and the algorithm's convergence property and computational complexity are also discussed. Finally, numerical results show that IRS-aided schemes significantly reduce the MSE as compared to their counterparts without IRS, and the proposed algorithm outperforms other baseline schemes.
Abstract:This paper proposes a novel feature called spectrum congruency for describing edges in images. The spectrum congruency is a generalization of the phase congruency, which depicts how much each Fourier components of the image are congruent in phase. Instead of using fixed bases in phase congruency, the spectrum congruency measures the congruency of the energy distribution of multiscale patches in a data-driven transform domain, which is more adaptable to the input images. Multiscale image patches are used to acquire different frequency components for modeling the local energy and amplitude. The spectrum congruency coincides nicely with human visions of perceiving features and provides a more reliable way of detecting edges. Unlike most existing differential-based multiscale edge detectors which simply combine the multiscale information, our method focuses on exploiting the correlation of the multiscale patches based on their local energy. We test our proposed method on synthetic and real images, and the results demonstrate that our approach is practical and highly robust to noise.