Abstract: We introduce LOcc, an effective and generalizable framework for open-vocabulary occupancy (OVO) prediction. Previous approaches typically supervise the networks either through coarse voxel-to-text correspondences that use image features as intermediates, or through noisy and sparse correspondences from voxel-based model-view projections. To alleviate this inaccurate supervision, we propose a semantic transitive labeling pipeline that generates dense and fine-grained 3D language occupancy ground truth. The pipeline mines the semantic information in images, transferring text labels from images to LiDAR point clouds and ultimately to voxels, to establish precise voxel-to-text correspondences. By replacing the original prediction head of supervised occupancy models with a geometry head for binary occupancy states and a language head for language features, LOcc effectively uses the generated language ground truth to guide the learning of a 3D language volume. Through extensive experiments, we demonstrate that our semantic transitive labeling pipeline produces more accurate pseudo-labeled ground truth, reducing the need for labor-intensive human annotation. Additionally, we validate LOcc across various architectures, where all models consistently outperform state-of-the-art zero-shot occupancy prediction approaches on the Occ3D-nuScenes dataset. Notably, even based on the simpler BEVDet model, with an input resolution of 256 × 704, Occ-BEVDet achieves an mIoU of 20.29, surpassing previous approaches that rely on temporal images, higher-resolution inputs, or larger backbone networks. The code for the proposed method is available at https://github.com/pkqbajng/LOcc.
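For readers unfamiliar with the mIoU figure quoted above, the following is a minimal sketch of the generic mean intersection-over-union metric on flattened voxel labels. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the benchmark's exact evaluation protocol (class set, visibility masks) may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes (generic definition).

    pred and target are equal-length sequences of integer class ids,
    one per voxel. Classes absent from both are skipped (an assumption).
    """
    ious = []
    for c in range(num_classes):
        # Voxels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Voxels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and four voxels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.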
Abstract: This paper analyzes the impact of a pilot-sharing scheme on synchronization performance in a scenario where several slave access points (APs) with uncertain carrier frequency offsets (CFOs) and timing offsets (TOs) share a common pilot sequence. First, the Cramér-Rao bound (CRB) with pilot contamination is derived for pilot-pairing estimation. Furthermore, a maximum likelihood algorithm is presented to estimate the CFO and TO among the pairing APs. Then, to minimize the sum of CRBs, we devise a synchronization strategy based on a pilot-sharing scheme by jointly optimizing the cluster classification, synchronization overhead, and pilot-sharing scheme, while simultaneously considering the overhead and each AP's synchronization requirements. To solve this NP-hard problem, we decompose it into two sub-problems, namely the cluster classification problem and the pilot-sharing problem. To strike a balance between synchronization performance and overhead, we first classify the clusters by using the K-means algorithm, and propose a criterion to find a good set of master APs. Then, the pilot-sharing scheme is obtained by using swap-matching operations. Simulation results validate the accuracy of our derivations and demonstrate the effectiveness of the proposed scheme over the benchmark schemes.
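The cluster classification step mentioned above relies on K-means. As a rough illustration only, here is a minimal sketch of Lloyd's algorithm applied to hypothetical 2-D AP coordinates. The function name `kmeans`, the use of AP positions as features, and the deterministic initialization from the first `k` points are all assumptions; the paper's actual features and its master-AP selection criterion are not reproduced here.

```python
import math


def kmeans(points, k, iters=50):
    """Lloyd's algorithm; centroids start at the first k points (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical AP coordinates forming two well-separated groups.
ap_positions = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels, centroids = kmeans(ap_positions, k=2)
```

With these toy coordinates the APs near the origin end up in one cluster and the APs near (10, 10) in the other.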
Abstract: Resource allocation is conceived for cell-free (CF) massive multi-input multi-output (MIMO)-aided ultra-reliable and low latency communication (URLLC) systems. Specifically, to support multiple devices with limited pilot overhead, pilot reuse among the users is considered, where we formulate a joint pilot length and pilot allocation strategy for maximizing the number of devices admitted. Then, the pilot power and transmit power are jointly optimized while simultaneously satisfying the devices' decoding error probability, latency, and data rate requirements. First, we derive the lower bounds (LBs) of the ergodic data rate under finite channel blocklength (FCBL). Then, we propose a novel pilot assignment algorithm for maximizing the number of devices admitted. Based on the pilot allocation pattern advocated, the weighted sum rate (WSR) is maximized by jointly optimizing the pilot power and payload power. To tackle the resultant NP-hard problem, the original optimization problem is first simplified by sophisticated mathematical transformations, and then approximations are found for transforming it into a series of subproblems in geometric programming (GP) form that can be readily solved. Simulation results demonstrate that the proposed pilot allocation strategy is capable of significantly increasing the number of admitted devices and that the proposed power allocation achieves a substantial WSR performance gain.
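To give a feel for why finite blocklength matters, the sketch below evaluates the widely used normal approximation of the achievable rate for a given SINR, R ≈ log2(1 + γ) − sqrt(V/n) · Q⁻¹(ε) / ln 2, with channel dispersion V = 1 − (1 + γ)⁻². This is the generic approximation, not the lower bound derived in the paper; the function name `fbl_rate` and the example parameters are assumptions.

```python
import math
from statistics import NormalDist


def fbl_rate(snr, blocklength, error_prob):
    """Normal approximation of the finite-blocklength rate in bits per channel use.

    snr is linear (not dB), blocklength is the number of channel uses n,
    and error_prob is the target decoding error probability epsilon.
    """
    capacity = math.log2(1.0 + snr)              # Shannon term
    dispersion = 1.0 - (1.0 + snr) ** -2         # channel dispersion V
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)  # Q^{-1}(epsilon)
    penalty = math.sqrt(dispersion / blocklength) * q_inv / math.log(2.0)
    return capacity - penalty


# Short packets pay a larger rate penalty than long ones.
short = fbl_rate(snr=10.0, blocklength=100, error_prob=1e-5)
long_ = fbl_rate(snr=10.0, blocklength=1000, error_prob=1e-5)
```

The penalty term shrinks as the blocklength grows, so `long_` is closer to the Shannon capacity log2(11) than `short` is.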
Abstract: Ultra-reliable and low-latency communication (URLLC) is a pivotal technique for enabling wireless control over industrial Internet-of-Things (IIoT) devices. By deploying distributed access points (APs), cell-free massive multiple-input and multiple-output (CF mMIMO) has great potential to provide URLLC services for IIoT devices. In this paper, we investigate CF mMIMO-enabled URLLC in a smart factory. Lower bounds (LBs) of the downlink ergodic data rate under finite channel blocklength (FCBL) with imperfect channel state information (CSI) are derived for maximum-ratio transmission (MRT), full-pilot zero-forcing (FZF), and local zero-forcing (LZF) precoding schemes. Meanwhile, the weighted sum rate is maximized by jointly optimizing the pilot power and transmission power based on the derived LBs. Specifically, we first provide the globally optimal solution of the pilot power, and then introduce some approximations to transform the original problems into a series of subproblems, which can be expressed in a geometric programming (GP) form that can be readily solved. Finally, an iterative algorithm is proposed to optimize the power allocation based on various precoding schemes. Simulation results demonstrate that the proposed algorithm is superior to the existing algorithms, and that the quality of URLLC services benefits from deploying more APs, except for the FZF precoding scheme.
Abstract: Smart factories need to support the simultaneous communication of multiple industrial Internet-of-Things (IIoT) devices with ultra-reliable and low-latency communication (URLLC). Meanwhile, short packet transmission for IIoT applications incurs performance loss compared to traditional long packet transmission for human-to-human communications. On the other hand, cell-free massive multiple-input and multiple-output (CF mMIMO) technology can provide uniform services for all devices by deploying distributed access points (APs). In this paper, we adopt CF mMIMO to support URLLC in a smart factory. Specifically, we first derive the lower bound (LB) on the achievable uplink data rate under finite blocklength (FBL) with imperfect channel state information (CSI) for both maximum-ratio combining (MRC) and full-pilot zero-forcing (FZF) decoders. The derived LB rates for the MRC case have the same trends as the ergodic rate, while the LB rates using the FZF decoder tightly match the ergodic rates, which means that resource allocation can be performed based on the LB data rate rather than the exact ergodic data rate under FBL. The log-function method and successive convex approximation (SCA) are then used to approximately transform the non-convex weighted sum rate problem into a series of geometric program (GP) problems, and an iterative algorithm is proposed to jointly optimize the pilot and payload power allocation. Simulation results demonstrate that CF mMIMO significantly improves the average weighted sum rate (AWSR) compared to centralized mMIMO. An interesting observation is that increasing the number of devices improves the AWSR for CF mMIMO, whilst the AWSR remains relatively constant for centralized mMIMO.
Abstract: As a fundamental task for intelligent robots, visual SLAM has made great progress over the past decades. However, robust SLAM in highly weak-textured environments remains very challenging. In this paper, we propose a novel visual SLAM system named RWT-SLAM to tackle this problem. We modify the LoFTR network, which is able to produce dense point matching in low-textured scenes, to generate feature descriptors. To integrate the new features into the popular ORB-SLAM framework, we develop feature masks to filter out unreliable features and employ a KNN strategy to strengthen the matching robustness. We also retrain the visual vocabulary on the new descriptors for efficient loop closing. The resulting RWT-SLAM is tested on various public datasets such as TUM and OpenLORIS, as well as on our own data. The results show very promising performance in highly weak-textured environments.
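One common way to use KNN for more robust descriptor matching is a two-nearest-neighbour search followed by a ratio test. The sketch below illustrates that generic technique on toy descriptors. Whether RWT-SLAM uses a ratio test, and with which threshold, is not stated in the abstract, so the function name `knn_ratio_match`, the Euclidean distance, and `ratio=0.8` are assumptions.

```python
import math


def knn_ratio_match(desc_a, desc_b, ratio=0.8):
    """Match descriptors in desc_a to desc_b using k = 2 neighbours and a ratio test.

    Returns (i, j) index pairs where desc_b[j] is the nearest neighbour of
    desc_a[i] and is clearly closer than the second nearest.
    """
    matches = []
    if len(desc_b) < 2:
        return matches  # the ratio test needs at least two candidates
    for i, a in enumerate(desc_a):
        # Rank candidates by Euclidean distance and keep the two nearest.
        ranked = sorted(range(len(desc_b)), key=lambda j: math.dist(a, desc_b[j]))
        best, second = ranked[0], ranked[1]
        # Accept only unambiguous matches.
        if math.dist(a, desc_b[best]) < ratio * math.dist(a, desc_b[second]):
            matches.append((i, best))
    return matches


# Toy 2-D descriptors: each query has one clearly nearest candidate.
queries = [(0.0, 0.0), (5.0, 5.0)]
candidates = [(0.1, 0.0), (5.0, 5.1), (2.5, 2.5)]
pairs = knn_ratio_match(queries, candidates)
```

A query that is equally close to two candidates is rejected, which is the behaviour that helps in repetitive or weakly textured regions.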