Abstract:Existing vision-based 3D occupancy prediction methods are inherently limited in accuracy due to their exclusive reliance on street-view imagery, neglecting the potential benefits of incorporating satellite views. We propose SA-Occ, the first Satellite-Assisted 3D occupancy prediction model, which leverages GPS & IMU to integrate historical yet readily available satellite imagery into real-time applications, effectively mitigating limitations of ego-vehicle perceptions, involving occlusions and degraded performance in distant regions. To address the core challenges of cross-view perception, we propose: 1) Dynamic-Decoupling Fusion, which resolves inconsistencies in dynamic regions caused by the temporal asynchrony between satellite and street views; 2) 3D-Proj Guidance, a module that enhances 3D feature extraction from inherently 2D satellite imagery; and 3) Uniform Sampling Alignment, which aligns the sampling density between street and satellite views. Evaluated on Occ3D-nuScenes, SA-Occ achieves state-of-the-art performance, especially among single-frame methods, with a 39.05% mIoU (a 6.97% improvement), while incurring only 6.93 ms of additional latency per frame. Our code and newly curated dataset are available at https://github.com/chenchen235/SA-Occ.
Abstract:Semantic segmentation and 3D reconstruction are two fundamental tasks in remote sensing, typically treated as separate or loosely coupled tasks. Despite attempts to integrate them into a unified network, the constraints between the two heterogeneous tasks are not explicitly modeled, since the pioneering studies either utilize a loosely coupled parallel structure or engage in only implicit interactions, failing to capture the inherent connections. In this work, we explore the connections between the two tasks and propose a new network that imposes semantic constraints on the stereo matching task, both implicitly and explicitly. Implicitly, we transform the traditional parallel structure to a new cascade structure termed Semantic-Guided Cascade structure, where the deep features enriched with semantic information are utilized for the computation of initial disparity maps, enhancing semantic guidance. Explicitly, we propose a Semantic Selective Refinement (SSR) module and a Left-Right Semantic Consistency (LRSC) module. The SSR refines the initial disparity map under the guidance of the semantic map. The LRSC ensures semantic consistency between two views via reducing the semantic divergence after transforming the semantic map from one view to the other using the disparity map. Experiments on the US3D and WHU datasets demonstrate that our method achieves state-of-the-art performance for both semantic segmentation and stereo matching.
Abstract:In the research of next-generation wireless communication technologies, orthogonal time frequency space (OTFS) modulation is emerging as a promising technique for high-speed mobile environments due to its superior efficiency and robustness in doubly selective channels. Additionally, the cell-free architecture, which eliminates the issues associated with cell boundaries, offers broader coverage for radio access networks. By combining cell-free network architecture with OTFS modulation, the system may meet the demands of massive random access required by machine-type communication devices in high-speed scenarios. This paper explores a massive random access scheme based on OTFS modulation within a cell-free architecture. A transceiver model for uplink OTFS signals involving multiple access points (APs) is developed, where channel estimation with fractional channel parameters is approximated as a block sparse matrix recovery problem. Building on existing superimposed and embedded preamble schemes, a hybrid preamble scheme is proposed. This scheme leverages superimposed and embedded preambles to respectively achieve rough and accurate active user equipment (UEs) detection (AUD), as well as precise channel estimation, under the condition of supporting a large number of access UEs. Moreover, this study introduces a generalized approximate message passing and pattern coupling sparse Bayesian learning with Laplacian prior (GAMP-PCSBL-La) algorithm, which effectively captures block sparse features after discrete cosine transform (DCT), delivering precise estimation results with reduced computational complexity. Simulation results demonstrate that the proposed scheme is effective and provides superior performance compared to other existing schemes.