Abstract: In upcoming vehicular networks, reconfigurable intelligent surfaces (RISs) are considered a key enabler of user self-localization without the intervention of access points (APs). In this paper, we investigate the feasibility of RIS-enabled self-localization with no APs. First, we develop a digital signal processing (DSP) unit that estimates geometric parameters such as angle, distance, and velocity and performs RIS-enabled self-localization. Second, we set up an experimental testbed consisting of a Texas Instruments frequency-modulated continuous-wave (FMCW) radar for the user and a SilversIMA module for the RIS. Our results confirm the validity of the developed DSP unit and demonstrate the feasibility of RIS-enabled self-localization.
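As a rough illustration of how an FMCW radar's DSP chain can map raw measurements to geometric parameters, the sketch below uses the standard textbook relations between beat frequency and range, and between chirp-to-chirp phase shift and radial velocity. It is not the DSP unit developed in the paper; the function names and all chirp parameters (bandwidth, chirp duration, carrier frequency) are hypothetical values chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def range_from_beat(beat_hz, bandwidth_hz, chirp_s):
    """Range from beat frequency: R = c * f_b / (2 * S), with chirp slope S = B / T_c."""
    slope = bandwidth_hz / chirp_s
    return C * beat_hz / (2.0 * slope)


def velocity_from_phase(dphi_rad, wavelength_m, chirp_s):
    """Radial velocity from the phase change between consecutive chirps:
    v = lambda * dphi / (4 * pi * T_c)."""
    return wavelength_m * dphi_rad / (4.0 * math.pi * chirp_s)


def range_resolution(bandwidth_hz):
    """Range resolution of a chirp with sweep bandwidth B: c / (2 * B)."""
    return C / (2.0 * bandwidth_hz)


# Hypothetical chirp configuration (assumed, not taken from the testbed).
BANDWIDTH_HZ = 4e9
CHIRP_S = 40e-6
WAVELENGTH_M = C / 77e9

target_range = range_from_beat(1e6, BANDWIDTH_HZ, CHIRP_S)
target_velocity = velocity_from_phase(math.pi / 2, WAVELENGTH_M, CHIRP_S)
```

Under these assumed parameters, a wider sweep bandwidth tightens the range resolution, while a shorter chirp duration extends the unambiguous velocity range.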
Abstract: In recent years, vision transformers with a text decoder have demonstrated remarkable performance on Scene Text Recognition (STR) due to their high learning capacity and their ability to capture long-range dependencies and contextual relationships. However, the computational and memory demands of these models are significant, limiting their deployment in resource-constrained applications. To address this challenge, we propose an efficient and accurate STR system. Specifically, we focus on improving the efficiency of encoder models by introducing a cascaded-transformers structure. This structure progressively reduces the vision token size during the encoding step, eliminating redundant tokens and reducing computational cost. Our experimental results confirm that our STR system achieves performance comparable to state-of-the-art baselines while substantially decreasing computational requirements. In particular, for large models, accuracy changes only from 92.77 to 92.68 while computational complexity is almost halved with our structure.
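To see why shrinking the token set between encoder stages saves compute, the following sketch compares a common back-of-the-envelope multiply-accumulate (MAC) estimate for a uniform encoder and a cascaded one. The per-layer cost formula is a standard approximation; the layer counts, token counts, embedding dimension, and halving schedule are all hypothetical and are not the paper's configuration.

```python
def layer_macs(n_tokens, dim, mlp_ratio=4):
    """Approximate MACs of one transformer encoder layer."""
    proj = 4 * n_tokens * dim * dim             # Q, K, V and output projections
    attn = 2 * n_tokens * n_tokens * dim        # QK^T scores and attention-weighted values
    mlp = 2 * mlp_ratio * n_tokens * dim * dim  # two linear layers of the MLP
    return proj + attn + mlp


def encoder_macs(stages, dim):
    """Total MACs for an encoder given as a list of (num_layers, n_tokens) stages."""
    return sum(layers * layer_macs(n, dim) for layers, n in stages)


# Assumed setup: 12 layers, 128 vision tokens, embedding dimension 384.
baseline = encoder_macs([(12, 128)], dim=384)

# Assumed cascade: same depth, but the token count is halved after each stage.
cascaded = encoder_macs([(4, 128), (4, 64), (4, 32)], dim=384)

ratio = cascaded / baseline
```

Because the projection and MLP terms scale linearly with the token count and the attention term scales quadratically, later stages that operate on fewer tokens cost much less than the first stage, so `ratio` falls well below 1 for this assumed schedule.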
Abstract: Scaling architectures have been proven effective for improving Scene Text Recognition (STR), but the individual contributions of vision encoder and text decoder scaling remain under-explored. In this work, we present an in-depth empirical analysis and demonstrate that, contrary to previous observations, scaling the decoder yields significant performance gains, always exceeding those achieved by encoder scaling alone. We also identify label noise as a key challenge in STR, particularly in real-world data, which can limit the effectiveness of STR models. To address this, we propose Cloze Self-Distillation (CSD), a method that mitigates label noise by distilling a student model from context-aware soft predictions and pseudolabels generated by a teacher model. Additionally, we enhance the decoder architecture by introducing differential cross-attention for STR. Our methodology achieves state-of-the-art performance on 10 out of 11 benchmarks using only real data, while significantly reducing the parameter size and computational costs.
Abstract: Localization and tracking are critical components of integrated sensing and communication (ISAC) systems, enhancing resource management, beamforming accuracy, and overall system reliability through precise sensing. Due to the high path loss of high-frequency systems, antenna arrays are required at both the transmitter and the receiver for beamforming gain. However, beam misalignment may occur, which requires accurate tracking of the six-dimensional (6D) state, namely, 3D position and 3D orientation. In this work, we first address the challenge that the rotation matrix belongs to a Lie group rather than Euclidean space, which necessitates deriving the intrinsic Cram\'er-Rao bound (ICRB) as an intrinsic performance benchmark. Then, leveraging the derived ICRB, we develop two filters, one utilizing pose fusion and the other employing an error-state Kalman filter, to estimate the 6D state of the user equipment (UE) under different computational resource and accuracy requirements. Simulation results validate the ICRB and assess the performance of the proposed filters, demonstrating their effectiveness and improved accuracy in 6D state tracking.
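The point that orientation lives on a Lie group rather than in Euclidean space can be illustrated with a small numerical sketch. It applies the same small rotation correction to a rotation matrix in two ways: through the SO(3) exponential map (Rodrigues' formula) and through a first-order, Euclidean-style update. This is a generic illustration and not the paper's filters; the helper names and the numerical values of `R` and `delta` are assumptions.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def hat(w):
    """Map a 3-vector to its skew-symmetric matrix [w]x."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]


def scale(a, s):
    return [[s * a[i][j] for j in range(3)] for i in range(3)]


def so3_exp(w):
    """Rodrigues' formula: exp([w]x) = I + (sin t / t) K + ((1 - cos t) / t^2) K^2."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    if theta < 1e-12:  # small-angle limits of the two coefficients
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return add(IDENTITY, add(scale(K, a), scale(matmul(K, K), b)))


def orthogonality_error(R):
    """Largest entry of |R^T R - I|; zero for a valid rotation matrix."""
    Rt = [list(col) for col in zip(*R)]
    RtR = matmul(Rt, R)
    return max(abs(RtR[i][j] - IDENTITY[i][j])
               for i in range(3) for j in range(3))


R = so3_exp([0.3, -0.2, 0.5])   # assumed current orientation
delta = [0.05, 0.02, -0.03]     # assumed small rotation correction (rad)

R_exp = matmul(R, so3_exp(delta))             # update that stays on SO(3)
R_lin = matmul(R, add(IDENTITY, hat(delta)))  # first-order update, leaves SO(3)
```

Here `R_exp` remains orthogonal up to floating-point error, whereas `R_lin` does not, which is one reason estimators and bounds for orientation are typically formulated intrinsically on the group.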
Abstract: Ensuring positioning integrity amid faulty measurements is crucial for safety-critical applications, making receiver autonomous integrity monitoring (RAIM) indispensable. This paper introduces a Bayesian RAIM algorithm with a streamlined architecture for snapshot-type 3D cellular positioning. Unlike traditional frequentist-type RAIM algorithms, it computes the exact posterior probability density function (PDF) of the position vector as a Gaussian mixture (GM) model using efficient message passing along a factor graph. This Bayesian approach retains all crucial information from the measurements, eliminates the need to discard faulty measurements, and results in tighter protection levels (PLs) in 3D space and 1D/2D subspaces that meet target integrity risk (TIR) requirements. Numerical simulations demonstrate that the Bayesian RAIM algorithm significantly outperforms a baseline algorithm, achieving over $50\%$ PL reduction at a comparable computational cost.
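A minimal 1D sketch of how a protection level could be read off a Gaussian-mixture posterior: given mixture weights, means, and standard deviations, it searches for the smallest half-width around a position estimate whose outside probability does not exceed the target integrity risk. This is an illustrative reading of the PL/TIR relationship and not the paper's message-passing algorithm; the function names, the bisection search, and the example mixtures are assumptions.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal Z, computed with erfc for tail accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def integrity_risk(pl, x_hat, weights, means, stds):
    """Posterior probability that the true position lies outside [x_hat - pl, x_hat + pl]."""
    risk = 0.0
    for w, m, s in zip(weights, means, stds):
        above = upper_tail((x_hat + pl - m) / s)    # P(x > x_hat + pl)
        below = upper_tail((m - (x_hat - pl)) / s)  # P(x < x_hat - pl)
        risk += w * (above + below)
    return risk


def protection_level(x_hat, weights, means, stds, tir, tol=1e-9):
    """Smallest pl with integrity_risk(pl) <= tir (requires 0 < tir < 1)."""
    lo, hi = 0.0, 1.0
    while integrity_risk(hi, x_hat, weights, means, stds) > tir:
        hi *= 2.0  # grow the bracket until the risk target is met
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrity_risk(mid, x_hat, weights, means, stds) > tir:
            lo = mid
        else:
            hi = mid
    return hi


# Assumed example: a fault-free posterior versus one with a 1% faulty-mode component.
pl_single = protection_level(0.0, [1.0], [0.0], [1.0], tir=1e-3)
pl_mixture = protection_level(0.0, [0.99, 0.01], [0.0, 5.0], [1.0, 1.0], tir=1e-3)
```

In this assumed example the small off-center component widens the protection level relative to the single-Gaussian case, because its probability mass must also be covered to meet the same risk target.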
Abstract: This paper analyzes monostatic sensing by a user equipment (UE) for a setting in which the UE is unable to resolve multiple targets due to their interference within a single resolution bin. It is shown how sensing accuracy, in terms of both detection rate and localization accuracy, can be boosted by a reconfigurable intelligent surface (RIS), which can be advantageously used to provide signal diversity and aid in resolving the targets. Specifically, assuming prior information on the presence of a cluster of targets, a RIS beam sweep procedure is used to facilitate high-resolution sensing. We derive the Cram\'er-Rao lower bounds (CRLBs) for channel parameter estimation and sensing and an upper bound on the detection probability. The concept of coherence is defined and analyzed theoretically. Then, we propose an orthogonal matching pursuit (OMP) channel estimation algorithm combined with data association to fuse the information of the non-RIS signal and the RIS signal and perform sensing. Finally, we provide numerical results to verify the potential of RIS for improving sensor resolution, and to demonstrate that the proposed methods can realize this potential for RIS-assisted high-resolution sensing.
Abstract: Simultaneous localization and mapping (SLAM) methods need to solve both the data association (DA) problem and the joint estimation of the sensor trajectory and the map, conditioned on a DA. In this paper, we propose a novel integrated approach that solves the DA problem and the batch SLAM problem simultaneously, combining random finite set (RFS) theory and the graph-based SLAM approach. A sampling method based on the Poisson multi-Bernoulli mixture (PMBM) density is designed to deal with the DA uncertainty, and a graph-based SLAM solver is applied to the conditional SLAM problem. Finally, a post-processing approach is applied to merge SLAM results from different iterations. Using synthetic data, it is demonstrated that the proposed SLAM approach achieves performance close to the posterior Cram\'er-Rao bound, and outperforms state-of-the-art RFS-based SLAM filters in high clutter and high process noise scenarios.
Abstract: Future generations of mobile networks call for concurrent sensing and communication functionalities in the same hardware and/or spectrum. Compared to communication, sensing services often suffer from limited coverage, due to the high path loss of the reflected signal and the increased infrastructure requirements. To provide a more uniform quality of service, distributed multiple-input multiple-output (D-MIMO) systems deploy a large number of distributed nodes and efficiently control them, making distributed integrated sensing and communications (ISAC) possible. In this paper, we investigate ISAC in D-MIMO through the lens of different design architectures and deployments, revealing both conflicts and synergies. In addition, simulation and demonstration results reveal both opportunities and challenges towards the implementation of ISAC in D-MIMO.
Abstract: Millimeter-wave (mmWave) signals provide attractive opportunities for sensing due to their inherent geometrical connections to physical propagation channels. Two common modalities used in mmWave sensing are monostatic and bistatic sensing, which are usually considered separately. By integrating these two modalities, information can be shared between them, leading to improved sensing performance. In this paper, we investigate the integration of monostatic and bistatic sensing in a 5G mmWave scenario, implement extended Kalman-Poisson multi-Bernoulli sequential filters to solve the sensing problems, and propose a method to periodically fuse user states and maps from the two sensing modalities.
Abstract: Belief propagation (BP) is a useful probabilistic inference algorithm for efficiently computing approximate marginal probability densities of random variables. However, in its standard form, BP is applicable only to vector-type random variables, while certain applications rely on set-type random variables with an unknown number of vector elements. In this paper, we first develop BP rules for set-type random variables and demonstrate that vector-type BP is a special case of set-type BP. We further propose factor graphs with set-factor and set-variable nodes, devising set-factor nodes that can handle set-variables with random elements and random cardinality, whereas the number of vector elements in vector-type variables is known. To demonstrate the validity of the developed set-type BP, we apply it to the Poisson multi-Bernoulli (PMB) filter for simultaneous localization and mapping (SLAM), which naturally leads to a new set-type BP-SLAM filter. Finally, we reveal connections between the vector-type BP-SLAM filter and the proposed set-type BP-SLAM filter and show a performance gain of the proposed set-type BP-SLAM filter in comparison with the vector-type BP-SLAM filter.
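For readers unfamiliar with standard (vector-type) BP, the sketch below runs the sum-product rule on a tiny chain of three binary variables and checks the resulting marginal against brute-force enumeration. It illustrates only the conventional algorithm that the abstract generalizes, not the proposed set-type BP; the potentials and function names are hypothetical.

```python
from itertools import product

# Assumed unary potentials for binary variables x1, x2, x3.
phi = {1: [0.7, 0.3], 2: [0.5, 0.5], 3: [0.2, 0.8]}

# Assumed pairwise potentials: psi12[x1][x2] and psi23[x2][x3].
psi12 = [[0.9, 0.1], [0.1, 0.9]]
psi23 = [[0.8, 0.2], [0.3, 0.7]]


def normalize(v):
    total = sum(v)
    return [x / total for x in v]


def bp_marginal_x2():
    """Marginal of x2 via sum-product messages on the chain x1 - x2 - x3."""
    # Message from factor psi12 to x2, with x1's unary potential folded in.
    m12 = [sum(phi[1][a] * psi12[a][b] for a in range(2)) for b in range(2)]
    # Message from factor psi23 to x2, with x3's unary potential folded in.
    m32 = [sum(phi[3][c] * psi23[b][c] for c in range(2)) for b in range(2)]
    # Belief at x2: product of its own potential and all incoming messages.
    return normalize([phi[2][b] * m12[b] * m32[b] for b in range(2)])


def brute_force_marginal_x2():
    """Marginal of x2 by summing the full joint over x1 and x3."""
    marg = [0.0, 0.0]
    for a, b, c in product(range(2), repeat=3):
        marg[b] += (phi[1][a] * phi[2][b] * phi[3][c]
                    * psi12[a][b] * psi23[b][c])
    return normalize(marg)


bp_result = bp_marginal_x2()
exact_result = brute_force_marginal_x2()
```

Because this graph is a tree, the two results agree up to floating-point error; on graphs with loops, BP generally yields only approximate marginals, which matches the abstract's description of BP as an approximate inference method.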