Abstract:This paper reveals the potential of movable antennas in enhancing anti-jamming communication. We consider a legitimate communication link in the presence of multiple jammers and propose deploying a movable antenna array at the receiver to combat jamming attacks. We formulate the problem as signal-to-interference-plus-noise ratio maximization by jointly optimizing the receive beamforming and antenna element positioning. Because the problem is non-convex and difficult to solve directly, we develop a deep learning-based framework in which beamforming is tackled as a Rayleigh quotient problem, while antenna positioning is addressed through multi-layer perceptron training. The neural network parameters are optimized using stochastic gradient descent to obtain an effective jamming mitigation strategy, featuring offline training with marginal complexity for online inference. Numerical results demonstrate that the proposed approach achieves near-optimal anti-jamming performance while significantly improving the efficiency of strategy determination.
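The Rayleigh quotient step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, a far-field linear array with element positions given in wavelengths, equal-power jammers, and the standard closed-form maximizer w = R^{-1} h of the generalized Rayleigh quotient |w^H h|^2 / (w^H R w). All function names and parameters are hypothetical, and the multi-layer perceptron that selects the antenna positions is not shown.

```python
import cmath
import math


def steering_vector(positions, angle_rad):
    """Far-field response of a linear array (assumed model); positions in wavelengths."""
    return [cmath.exp(2j * math.pi * p * math.sin(angle_rad)) for p in positions]


def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def sinr(w, h, R):
    """Generalized Rayleigh quotient |w^H h|^2 / (w^H R w)."""
    n = len(w)
    signal = abs(sum(w[i].conjugate() * h[i] for i in range(n))) ** 2
    denom = sum(w[i].conjugate() * R[i][j] * w[j] for i in range(n) for j in range(n))
    return signal / denom.real


def max_sinr_beamformer(positions, user_angle, jammer_angles, jammer_power, noise_power):
    """Return (w, h, R), where w = R^{-1} h maximizes the SINR for fixed positions."""
    n = len(positions)
    h = steering_vector(positions, user_angle)
    # Jamming-plus-noise covariance: noise on the diagonal plus one rank-1 term per jammer.
    R = [[(noise_power if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for ang in jammer_angles:
        g = steering_vector(positions, ang)
        for i in range(n):
            for j in range(n):
                R[i][j] += jammer_power * g[i] * g[j].conjugate()
    w = solve(R, h)
    return w, h, R


if __name__ == "__main__":
    w, h, R = max_sinr_beamformer([0.0, 0.5, 1.0, 1.5], 0.0, [0.3, -0.6], 10.0, 1.0)
    print("max-SINR beamformer:", sinr(w, h, R))
    print("matched filter     :", sinr(h, h, R))
```

Because the optimal beamformer is available in closed form for any fixed set of positions, a learning-based method only has to search over the positions, which is one plausible reading of the split described in the abstract.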
Abstract:Recent advancements in large language models (LLMs) have shown promise in generating psychotherapeutic dialogues, especially in Motivational Interviewing (MI). However, how to employ strategies, i.e., a set of MI skills, to generate therapeutically adherent and explainable conversations remains underexplored. We propose an approach called strategy-aware dialogue generation with Chain-of-Strategy (CoS) planning, which first predicts MI strategies as reasoning and then uses these strategies to guide the subsequent dialogue generation. It brings the potential for controllable and explainable generation in psychotherapy by aligning the generated MI dialogues with therapeutic strategies. Extensive experiments including automatic and human evaluations are conducted to validate the effectiveness of the MI strategy. Our findings demonstrate the potential of LLMs in producing strategically aligned dialogues and suggest directions for practical applications in psychotherapeutic settings.
Abstract:3D Gaussian Splatting showcases notable advancements in photo-realistic and real-time novel view synthesis. However, it faces challenges in modeling mirror reflections, which exhibit substantial appearance variations from different viewpoints. To tackle this problem, we present MirrorGaussian, the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting. The key insight is grounded on the mirror symmetry between the real-world space and the virtual mirror space. We introduce an intuitive dual-rendering strategy that enables differentiable rasterization of both the real-world 3D Gaussians and the mirrored counterpart obtained by reflecting the former about the mirror plane. All 3D Gaussians are jointly optimized with the mirror plane in an end-to-end framework. MirrorGaussian achieves high-quality and real-time rendering in scenes with mirrors, empowering scene editing like adding new mirrors and objects. Comprehensive experiments on multiple datasets demonstrate that our approach significantly outperforms existing methods, achieving state-of-the-art results. Project page: https://mirror-gaussian.github.io/.
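The reflection about the mirror plane described in the abstract above can be sketched as a Householder transform. This is only an illustration of the geometry under stated assumptions: the plane is written as normal . x + d = 0, a Gaussian is reduced to its mean and 3x3 covariance, and the function names are hypothetical. How MirrorGaussian treats view-dependent color and the rasterization itself is not covered by the abstract and is not shown.

```python
import math


def reflect_point(p, normal, d):
    """Reflect point p about the plane normal . x + d = 0 (normal need not be unit length)."""
    norm = math.sqrt(sum(c * c for c in normal))
    n = [c / norm for c in normal]
    # Signed distance from p to the plane along the unit normal.
    dist = sum(pi * ni for pi, ni in zip(p, n)) + d / norm
    return [pi - 2.0 * dist * ni for pi, ni in zip(p, n)]


def householder(normal):
    """Reflection matrix H = I - 2 n n^T / (n^T n)."""
    norm2 = sum(c * c for c in normal)
    return [[(1.0 if i == j else 0.0) - 2.0 * normal[i] * normal[j] / norm2
             for j in range(3)] for i in range(3)]


def reflect_covariance(cov, normal):
    """Mirrored covariance H cov H^T; the plane offset d does not affect it."""
    H = householder(normal)
    HC = [[sum(H[i][k] * cov[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return [[sum(HC[i][k] * H[j][k] for k in range(3)) for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    mean = [1.0, 2.0, 3.0]
    cov = [[1.0, 0.0, 0.5], [0.0, 4.0, 0.0], [0.5, 0.0, 9.0]]
    mirror_normal, mirror_offset = [0.0, 0.0, 1.0], 0.0  # the plane z = 0
    print(reflect_point(mean, mirror_normal, mirror_offset))
    print(reflect_covariance(cov, mirror_normal))
```

Since the reflection is an affine map of the plane parameters, gradients can flow to both the Gaussians and the plane, which is consistent with the joint end-to-end optimization the abstract describes.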
Abstract:In this paper, we propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis. Existing neural radiance fields (NeRF) based solutions learn the deformation in an implicit manner, which cannot incorporate 3D scene geometry. Therefore, the learned deformation is not necessarily geometrically coherent, which results in unsatisfactory dynamic view synthesis and 3D dynamic reconstruction. The recent 3D Gaussian Splatting provides a new representation of the 3D scene, upon which the 3D geometry can be exploited in learning the complex 3D deformation. Specifically, the scenes are represented as a collection of 3D Gaussians, where each 3D Gaussian is optimized to move and rotate over time to model the deformation. To enforce the 3D scene geometry constraint during deformation, we explicitly extract 3D geometry features and integrate them in learning the 3D deformation. In this way, our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction. Extensive experimental results on both synthetic and real datasets demonstrate the superiority of our solution, which achieves new state-of-the-art performance. The project is available at https://npucvr.github.io/GaGS/
Abstract:Existing NeRF-based methods for large scene reconstruction often have limitations in visual quality and rendering speed. While the recent 3D Gaussian Splatting works well on small-scale and object-centric scenes, scaling it up to large scenes poses challenges due to limited video memory, long optimization time, and noticeable appearance variations. To address these challenges, we present VastGaussian, the first method for high-quality reconstruction and real-time rendering on large scenes based on 3D Gaussian Splatting. We propose a progressive partitioning strategy to divide a large scene into multiple cells, where the training cameras and point cloud are properly distributed with an airspace-aware visibility criterion. These cells are merged into a complete scene after parallel optimization. We also introduce decoupled appearance modeling into the optimization process to reduce appearance variations in the rendered images. Our approach outperforms existing NeRF-based methods and achieves state-of-the-art results on multiple large scene datasets, enabling fast optimization and high-fidelity real-time rendering.
Abstract:The assembly instruction is a mandatory component of Lego-like brick sets. The conventional production of assembly instructions requires a considerable amount of manual fine-tuning, which is intractable for casual users and customized brick sets. Moreover, the traditional paper-based instructions lack expressiveness and interactivity. To tackle the two problems above, we present BrickPal, an augmented reality-based system, which visualizes assembly instructions in an augmented reality head-mounted display. It utilizes Natural Language Processing (NLP) techniques to generate plausible assembly sequences, and provides real-time guidance in the AR headset. Our user study demonstrates BrickPal's effectiveness at assisting users in brick assembly compared to traditional assembly methods. Additionally, the NLP algorithm-generated assembly sequences achieve the same usability as manually adapted sequences.
Abstract:Visible light communication (VLC) has been widely applied as a promising solution for modern short-range communication. When it comes to the deployment of LED arrays in VLC networks, the emerging ultra-dense network (UDN) technology can be adopted to expand the VLC network's capacity. However, inter-cell interference (ICI) mitigation and efficient power control in the VLC-based UDN remain critical challenges. To this end, a reinforcement learning (RL) based VLC UDN architecture is devised in this paper. The deployment of the cells is optimized via spatial reuse to mitigate ICI. An RL-based algorithm is proposed to dynamically optimize the policy of power and interference control, maximizing the system utility in the complicated and dynamic environment. Simulation results demonstrate the superiority of the proposed scheme: compared with the benchmark scheme, it increases the system utility and achievable data rate while reducing the energy consumption and ICI.
Abstract:Efficient data processing and computation are essential for the industrial Internet of things (IIoT) to empower various applications, yet they can be significantly bottlenecked by the limited energy capacity and computation capability of the IIoT nodes. In this paper, we employ an unmanned aerial vehicle (UAV) as an edge server to assist IIoT data processing, while considering the practical issue of UAV jittering. Specifically, we propose a joint design of trajectory and offloading strategies to minimize the energy consumption due to local and edge computation, as well as data transmission. We particularly address the UAV jittering that induces Gaussian-distributed uncertainties associated with flying waypoints, resulting in probabilistic-form flying speed and data offloading constraints. We exploit the Bernstein-type inequality to reformulate the constraints in deterministic forms and decompose the energy minimization to solve for trajectory and offloading separately within an alternating optimization framework. The subproblems are then tackled with the successive convex approximation technique. Simulation results show that our proposal strictly guarantees robustness under uncertainties and effectively reduces energy consumption as compared with the baselines.
Abstract:Massive random access of devices in the emerging Open Radio Access Network (O-RAN) poses great challenges to access control and management. Exploiting the bursty nature of the access requests, sparse active user detection (SAUD) is an effective enabler of efficient access management, but the sparsity may deteriorate in the case of uncoordinated massive access requests. To dynamically preserve the sparsity of access requests, a reinforcement-learning (RL)-assisted scheme of closed-loop access control utilizing the access class barring technique is proposed, where the RL policy is determined through continuous interaction between the RL agent, i.e., a next generation NodeB (gNB), and the environment. The proposed scheme can be implemented by the near-real-time RAN intelligent controller (near-RT RIC) in O-RAN, supporting rapid switching between heterogeneous vertical applications, such as mMTC and uRLLC services. Moreover, a data-driven scheme of deep-RL-assisted SAUD is proposed to handle highly complex environments with continuous and high-dimensional state and action spaces, where a replay buffer is applied for automatic large-scale data collection. An actor-critic framework is formulated to incorporate the strategy-learning modules into the near-RT RIC. Simulation results show that the proposed schemes achieve superior performance in both access efficiency and user detection accuracy over the benchmark scheme for different heterogeneous services with massive access requests.
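The closed-loop access class barring mechanism that the abstract above builds on can be sketched with a toy simulation. This is not the RL policy of the paper: the controller below is a simple stand-in heuristic that sets the barring factor to min(1, preambles / backlog), and it assumes the controller knows the backlog exactly, which a real gNB would have to estimate. All names and parameters are hypothetical.

```python
import random
from collections import Counter


def barring_factor(backlog, preambles):
    """Heuristic stand-in for the learned policy: admit about one device per preamble."""
    if backlog <= 0:
        return 1.0
    return min(1.0, preambles / backlog)


def access_slot(backlog, preambles, factor, rng):
    """One slot: each device passes the barring check with probability `factor`,
    then picks a preamble uniformly; a preamble chosen exactly once succeeds."""
    admitted = sum(1 for _ in range(backlog) if rng.random() < factor)
    picks = Counter(rng.randrange(preambles) for _ in range(admitted))
    successes = sum(1 for count in picks.values() if count == 1)
    return admitted, successes


def run(backlog, preambles, slots, seed=0):
    """Closed loop: recompute the barring factor from the remaining backlog each slot."""
    rng = random.Random(seed)
    served = 0
    for _ in range(slots):
        if backlog == 0:
            break
        factor = barring_factor(backlog, preambles)
        _, ok = access_slot(backlog, preambles, factor, rng)
        backlog -= ok
        served += ok
    return served, backlog


if __name__ == "__main__":
    served, left = run(backlog=200, preambles=20, slots=50)
    print("served:", served, "still waiting:", left)
```

Keeping the number of simultaneous contenders near the number of preambles is what keeps the set of active users sparse in each slot; an RL agent, as proposed in the abstract, would replace `barring_factor` with a policy learned from interaction.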
Abstract:Reconfigurable intelligent surfaces (RISs) are recognized as having great potential to strengthen wireless security, yet the performance gain largely depends on the deployment location of RISs in the network topology. In this paper, we consider anti-eavesdropping communication established through a RIS at a fixed location, as well as an aerial platform mounting another RIS and a friendly jammer to further improve the secrecy. The aerial RIS helps enhance the legitimate signal and the aerial cooperative jamming is strengthened through the fixed RIS. The security gain with aerial reflection and jamming is further improved with the optimized deployment of the aerial platform. We particularly consider the imperfect channel state information issue and address the worst-case secrecy for robust performance. The formulated robust secrecy rate maximization problem is decomposed into two layers, where the inner layer solves for reflection and jamming with robust optimization, and the outer layer tackles the aerial deployment through deep reinforcement learning. Simulation results show the deployment under different network topologies and demonstrate the performance superiority of our proposal in terms of the worst-case security provisioning as compared with the baselines.