Abstract:Hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks are a promising paradigm of heterogeneous network (HetNet), attributed to the complementary physical properties of optical spectra and radio frequency. However, the current development of such HetNets is mostly bottlenecked by the existing transmission control protocol (TCP), which restricts the user equipment (UE) to connecting one access point (AP) at a time. While the ongoing investigation on multipath TCP (MPTCP) can bring significant benefits, it complicates the network topology of HetNets, making the existing load balancing (LB) learning models less effective. Driven by this, we propose a graph neural network (GNN)-based model to tackle the LB problem for MPTCP-enabled HetNets, which results in a partial mesh topology. Such a topology can be modeled as a graph, with the channel state information and data rate requirement embedded as node features, while the LB solutions are deemed as edge labels. Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths: i) it can better interpret a complex network topology; and ii) it can handle various numbers of APs and UEs with a single trained model. Simulation results show that against the traditional optimisation method, the proposed learning model can achieve near-optimal throughput within a gap of 11.5%, while reducing the inference time by 4 orders of magnitude. In contrast to the DNN model, the new method can improve the network throughput by up to 21.7%, at a similar inference time level.
Abstract:The purpose of RGB-D Salient Object Detection (SOD) is to pinpoint the most visually conspicuous areas within images accurately. While conventional deep models heavily rely on CNN extractors and overlook the long-range contextual dependencies, subsequent transformer-based models have addressed the issue to some extent but introduce high computational complexity. Moreover, incorporating spatial information from depth maps has been proven effective for this task. A primary challenge of this issue is how to fuse the complementary information from RGB and depth effectively. In this paper, we propose a dual Mamba-driven cross-modal fusion network for RGB-D SOD, named MambaSOD. Specifically, we first employ a dual Mamba-driven feature extractor for both RGB and depth to model the long-range dependencies in multiple modality inputs with linear complexity. Then, we design a cross-modal fusion Mamba for the captured multi-modal features to fully utilize the complementary information between the RGB and depth features. To the best of our knowledge, this work is the first attempt to explore the potential of the Mamba in the RGB-D SOD task, offering a novel perspective. Numerous experiments conducted on six prevailing datasets demonstrate our method's superiority over sixteen state-of-the-art RGB-D SOD models. The source code will be released at https://github.com/YueZhan721/MambaSOD.
Abstract:The 3D scene editing method based on neural implicit field has gained wide attention. It has achieved excellent results in 3D editing tasks. However, existing methods often blend the interaction between objects and scene environment. The change of scene appearance like shadows is failed to be displayed in the rendering view. In this paper, we propose an Object and Scene environment Interaction aware (OSI-aware) system, which is a novel two-stream neural rendering system considering object and scene environment interaction. To obtain illuminating conditions from the mixture soup, the system successfully separates the interaction between objects and scene environment by intrinsic decomposition method. To study the corresponding changes to the scene appearance from object-level editing tasks, we introduce a depth map guided scene inpainting method and shadow rendering method by point matching strategy. Extensive experiments demonstrate that our novel pipeline produce reasonable appearance changes in scene editing tasks. It also achieve competitive performance for the rendering quality in novel-view synthesis tasks.
Abstract:Generalized optical multiple-input multiple-output (GOMIMO) techniques have been recently shown to be promising for high-speed optical wireless communication (OWC) systems. In this paper, we propose a novel deep learning-aided GOMIMO (DeepGOMIMO) framework for GOMIMO systems, where channel state information (CSI)-free blind detection can be enabled by employing a specially designed deep neural network (DNN)-based MIMO detector. The CSI-free blind DNN detector mainly consists of two modules: one is the pre-processing module which is designed to address both the path loss and channel crosstalk issues caused by MIMO transmission, and the other is the feed-forward DNN module which is used for joint detection of spatial and constellation information by learning the statistics of both the input signal and the additive noise. Our simulation results clearly verify that, in a typical indoor 4 $\times$ 4 MIMO-OWC system using both generalized optical spatial modulation (GOSM) and generalized optical spatial multiplexing (GOSMP) with unipolar non-zero 4-ary pulse amplitude modulation (4-PAM) modulation, the proposed CSI-free blind DNN detector achieves near the same bit error rate (BER) performance as the optimal joint maximum-likelihood (ML) detector, but with much reduced computational complexity. Moreover, since the CSI-free blind DNN detector does not require instantaneous channel estimation to obtain accurate CSI, it enjoys the unique advantages of improved achievable data rate and reduced communication time delay in comparison to the CSI-based zero-forcing DNN (ZF-DNN) detector.
Abstract:Optical wireless communication (OWC) is considered to be a promising technology which will alleviate traffic burden caused by the increasing number of mobile devices. In this study, a novel vertical-cavity surface-emitting laser (VCSEL) array is proposed for indoor OWC systems. To activate the best beam for a mobile user, two beam activation methods are proposed for the system. The method based on a corner-cube retroreflector (CCR) provides very low latency and allows real-time activation for high-speed users. The other method uses the omnidirectional transmitter (ODTx). The ODTx can serve the purpose of uplink transmission and beam activation simultaneously. Moreover, systems with ODTx are very robust to the random orientation of a user equipment (UE). System level analyses are carried out for the proposed VCSEL array system. For a single user scenario, the probability density function (PDF) of the signal-to-noise ratio (SNR) for the central beam of the VCSEL array system can be approximated as a uniform distribution. In addition, the average data rate of the central beam and its upper bound are given analytically and verified by Monte-Carlo simulations. For a multi-user scenario, an analytical upper bound for the average data rate is given. The effects of the cell size and the full width at half maximum (FWHM) angle on the system performance are studied. The results show that the system with a FWHM angle of $4^\circ$ outperforms the others.
Abstract:Light-fidelity (LiFi) is an emerging technology for high-speed short-range mobile communications. Inter-cell interference (ICI) is an important issue that limits the system performance in an optical attocell network. Angle diversity receivers (ADRs) have been proposed to mitigate ICI. In this paper, the structure of pyramid receivers (PRs) and truncated pyramid receivers (TPRs) are studied. The coverage problems of PRs and TPRs are defined and investigated, and the lower bound of field of view (FOV) for each PD is given analytically. The impact of random device orientation and diffuse link signal propagation are taken into consideration. The performances of PRs and TPRs are compared and then optimized ADR structures are proposed. The performance comparison between the select best combining (SBC) and maximum ratio combining (MRC) is given under different noise levels. It is shown that SBC will outperform MRC in an interference limited system, otherwise, MRC is a preferred scheme. In addition, the double source system, where each LiFi AP consists of two sources transmitting the same information signals but with opposite polarity, is proved to outperform the single source (SS) system under certain conditions.