Abstract:3D Gaussian Splatting (3DGS) techniques have achieved satisfactory 3D scene representation. Despite their impressive performance, they confront challenges due to the limitation of structure-from-motion (SfM) methods on acquiring accurate scene initialization, or the inefficiency of densification strategy. In this paper, we introduce a novel framework EasySplat to achieve high-quality 3DGS modeling. Instead of using SfM for scene initialization, we employ a novel method to release the power of large-scale pointmap approaches. Specifically, we propose an efficient grouping strategy based on view similarity, and use robust pointmap priors to obtain high-quality point clouds and camera poses for 3D scene initialization. After obtaining a reliable scene structure, we propose a novel densification approach that adaptively splits Gaussian primitives based on the average shape of neighboring Gaussian ellipsoids, utilizing KNN scheme. In this way, the proposed method tackles the limitation on initialization and optimization, leading to an efficient and accurate 3DGS modeling. Extensive experiments demonstrate that EasySplat outperforms the current state-of-the-art (SOTA) in handling novel view synthesis.
Abstract:3D semantic maps have played an increasingly important role in high-precision robot localization and scene understanding. However, real-time construction of semantic maps requires mobile edge devices with extremely high computing power, which are expensive and limit the widespread application of semantic mapping. In order to address this limitation, inspired by cloud-edge collaborative computing and the high transmission efficiency of semantic communication, this paper proposes a method to achieve real-time semantic mapping tasks with limited-resource mobile devices. Specifically, we design an encoding-decoding semantic communication framework for real-time semantic mapping tasks under limited-resource situations. In addition, considering the impact of different channel conditions on communication, this paper designs a module based on the attention mechanism to achieve stable data transmission under various channel conditions. In terms of simulation experiments, based on the TUM dataset, it was verified that the system has an error of less than 0.1% compared to the groundtruth in mapping and localization accuracy and is superior to some novel semantic communication algorithms in real-time performance and channel adaptation. Besides, we implement a prototype system to verify the effectiveness of the proposed framework and designed module in real indoor scenarios. The results show that our system can complete real-time semantic mapping tasks for common indoor objects (chairs, computers, people, etc.) with a limited-resource device, and the mapping update time is less than 1 second.
Abstract:The autonomous exploration of environments by multi-robot systems is a critical task with broad applications in rescue missions, exploration endeavors, and beyond. Current approaches often rely on either greedy frontier selection or end-to-end deep reinforcement learning (DRL) methods, yet these methods are frequently hampered by limitations such as short-sightedness, overlooking long-term implications, and convergence difficulties stemming from the intricate high-dimensional learning space. To address these challenges, this paper introduces an innovative integration strategy that combines the low-dimensional action space efficiency of frontier-based methods with the far-sightedness and optimality of DRL-based approaches. We propose a three-tiered planning framework that first identifies frontiers in free space, creating a sparse map representation that lightens data transmission burdens and reduces the DRL action space's dimensionality. Subsequently, we develop a multi-graph neural network (mGNN) that incorporates states of potential targets and robots, leveraging policy-based reinforcement learning to compute affinities, thereby superseding traditional heuristic utility values. Lastly, we implement local routing planning through subsequence search, which avoids exhaustive sequence traversal. Extensive validation across diverse scenarios and comprehensive simulation results demonstrate the effectiveness of our proposed method. Compared to baseline approaches, our framework achieves environmental exploration with fewer time steps and a notable reduction of over 30% in data transmission, showcasing its superiority in terms of efficiency and performance.