Haochen Jiang

DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization

Nov 13, 2024

Yueming Xu, Haochen Jiang, Zhongyang Xiao, Jianfeng Feng, Li Zhang

Abstract: Achieving robust and precise pose estimation in dynamic scenes is a significant research challenge in visual Simultaneous Localization and Mapping (SLAM). Recent advances that integrate Gaussian Splatting into SLAM systems have proven effective at producing high-quality renderings using explicit 3D Gaussian models, significantly improving environmental reconstruction fidelity. However, these approaches depend on a static-environment assumption and struggle in dynamic environments because of inconsistent observations of geometry and photometry. To address this problem, we propose DG-SLAM, the first robust dynamic visual SLAM system grounded in 3D Gaussians, which provides precise camera pose estimation alongside high-fidelity reconstructions. Specifically, we propose effective strategies, including motion mask generation, adaptive Gaussian point management, and a hybrid camera tracking algorithm, to improve the accuracy and robustness of pose estimation. Extensive experiments demonstrate that DG-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and novel-view synthesis in dynamic scenes, outperforming existing methods while preserving real-time rendering ability.

RoDyn-SLAM: Robust Dynamic Dense RGB-D SLAM with Neural Radiance Fields

Jul 01, 2024

Haochen Jiang, Yueming Xu, Kejie Li, Jianfeng Feng, Li Zhang

Abstract: Leveraging neural implicit representations to conduct dense RGB-D SLAM has been studied in recent years. However, this approach relies on a static-environment assumption and does not work robustly in dynamic environments because of inconsistent observations of geometry and photometry. To address the challenges presented by dynamic environments, we propose a novel dynamic SLAM framework with a neural radiance field. Specifically, we introduce a motion mask generation method to filter out invalid sampled rays. This design fuses the optical flow mask and the semantic mask to enhance the precision of the motion mask. To further improve the accuracy of pose estimation, we design a divide-and-conquer pose optimization algorithm that distinguishes between keyframes and non-keyframes. The proposed edge warp loss effectively strengthens the geometric constraints between adjacent frames. Extensive experiments are conducted on two challenging datasets, and the results show that RoDyn-SLAM achieves state-of-the-art performance among recent neural RGB-D methods in both accuracy and robustness.

* IEEE RAL 2024
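
The mask-fusion step described in the RoDyn-SLAM abstract can be illustrated with a minimal sketch. The abstract does not state how the optical flow mask and the semantic mask are combined, so the union rule below is an assumption, and the names `fuse_motion_masks` and `valid_ray_pixels` are hypothetical, not taken from the paper or its code.

```python
from typing import List, Tuple

# A mask is a 2D grid of booleans; True marks a pixel believed to be dynamic.
Mask = List[List[bool]]


def fuse_motion_masks(flow_mask: Mask, semantic_mask: Mask) -> Mask:
    """Assumed fusion rule: a pixel is dynamic if either cue flags it (union)."""
    if len(flow_mask) != len(semantic_mask):
        raise ValueError("masks must have the same height")
    fused: Mask = []
    for flow_row, sem_row in zip(flow_mask, semantic_mask):
        if len(flow_row) != len(sem_row):
            raise ValueError("masks must have the same width")
        fused.append([f or s for f, s in zip(flow_row, sem_row)])
    return fused


def valid_ray_pixels(fused: Mask) -> List[Tuple[int, int]]:
    """Return (row, col) of pixels whose rays are kept, i.e. static pixels."""
    return [
        (r, c)
        for r, row in enumerate(fused)
        for c, is_dynamic in enumerate(row)
        if not is_dynamic
    ]
```

Rays would then be sampled only from the returned pixels, so that moving objects do not contribute to the loss. A union is the conservative choice here: it discards more rays but is less likely to let a dynamic pixel through.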

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

Mar 18, 2024

Haochen Jiang, Yueming Xu, Yihan Zeng, Hang Xu, Wei Zhang, Jianfeng Feng, Li Zhang

Abstract: 3D reconstruction has been widely used in autonomous navigation for mobile robotics. However, previous research provides only the basic geometric structure without the capability of open-world scene understanding, limiting advanced tasks such as human interaction and visual navigation. Moreover, traditional 3D scene understanding approaches rely on expensive labeled 3D datasets to train a model for a single task with supervision. Thus, geometric reconstruction with zero-shot scene understanding, i.e., open-vocabulary 3D understanding and reconstruction, is crucial for the future development of mobile robots. In this paper, we propose OpenOcc, a novel framework unifying 3D scene reconstruction and open-vocabulary understanding with neural radiance fields. We model the geometric structure of the scene with an occupancy representation and distill a pre-trained open-vocabulary model into a 3D language field via volume rendering for zero-shot inference. Furthermore, we propose a novel semantic-aware confidence propagation (SCP) method to relieve the degeneracy of the language field representation caused by inconsistent measurements in the distilled features. Experimental results show that our approach achieves competitive performance in 3D scene understanding tasks, especially for small and long-tail objects.
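
To make "volume rendering with an occupancy representation" concrete, here is a small sketch of one common formulation, in which the weight of the i-th sample along a ray is its occupancy times the probability that all earlier samples were empty: w_i = o_i * prod_{j<i} (1 - o_j). Whether OpenOcc uses exactly this weighting is an assumption; the abstract does not say. The function names are hypothetical.

```python
from typing import List, Sequence


def render_weights(occupancies: Sequence[float]) -> List[float]:
    """Per-sample weights w_i = o_i * prod_{j<i} (1 - o_j) along one ray."""
    weights: List[float] = []
    transmittance = 1.0  # probability that the ray has not been stopped yet
    for o in occupancies:
        weights.append(transmittance * o)
        transmittance *= 1.0 - o
    return weights


def render_feature(
    occupancies: Sequence[float], features: Sequence[Sequence[float]]
) -> List[float]:
    """Weighted sum of per-sample feature vectors (e.g. language features)."""
    if len(occupancies) != len(features):
        raise ValueError("need one feature vector per sample")
    weights = render_weights(occupancies)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
```

In a distillation setup, the rendered feature for a pixel would be compared against the 2D feature that a pre-trained open-vocabulary model produces for the same pixel.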

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

May 25, 2023

Yue Zhang, Bo Zhang, Haochen Jiang, Zhenghua Li, Chen Li, Fei Huang, Min Zhang

Abstract: We introduce NaSGEC, a new dataset to facilitate research on Chinese grammatical error correction (CGEC) for native speaker texts from multiple domains. Previous CGEC research primarily focuses on correcting texts from a single domain, especially learner essays. To broaden the target domain, we annotate multiple references for 12,500 sentences from three native domains, i.e., social media, scientific writing, and examination. We provide solid benchmark results for NaSGEC by employing cutting-edge CGEC models and different training data. We further perform detailed analyses of the connections and gaps between our domains from both empirical and statistical views. We hope this work can inspire future studies on an important but under-explored direction: cross-domain GEC.

* Accepted by ACL 2023 (Findings, long paper)

Mining Error Templates for Grammatical Error Correction

Jun 23, 2022

Yue Zhang, Haochen Jiang, Zuyi Bao, Bo Zhang, Chen Li, Zhenghua Li

Abstract: Some grammatical error correction (GEC) systems incorporate hand-crafted rules and achieve positive results. However, manually defining rules is time-consuming and laborious. In view of this, we propose a method to mine error templates for GEC automatically. An error template is a regular expression that aims to identify text errors. We use a web crawler to acquire such error templates from the Internet. For each template, we further select the corresponding corrective action by using language model perplexity as a criterion. We have accumulated 1,119 error templates for Chinese GEC based on this method. Experimental results on the newly proposed CTC-2021 Chinese GEC benchmark show that combining our error templates can effectively improve the performance of a strong GEC system, especially on two error types with very little training data. Our error templates are available at https://github.com/HillZhang1999/gec_error_template.
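
The idea of pairing a regular-expression template with a perplexity-selected corrective action can be sketched as follows. This is an illustration only: the English template, the candidate replacements, and the unigram `toy_perplexity` scorer are all hypothetical stand-ins (the paper targets Chinese and uses a real language model), and `choose_correction` is not a function from the authors' repository.

```python
import math
import re
from typing import Callable, Dict, Sequence


def toy_perplexity(
    sentence: str, unigram_probs: Dict[str, float], floor: float = 1e-6
) -> float:
    """Perplexity under a toy unigram model; unseen tokens get a small floor."""
    tokens = sentence.split()
    if not tokens:
        return float("inf")
    log_sum = sum(math.log(unigram_probs.get(t, floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def choose_correction(
    sentence: str,
    template: str,
    candidate_replacements: Sequence[str],
    perplexity: Callable[[str], float],
) -> str:
    """Apply each candidate action to the matched span; keep the most fluent."""
    pattern = re.compile(template)
    if not pattern.search(sentence):
        return sentence  # the template does not fire on this sentence
    candidates = [pattern.sub(rep, sentence) for rep in candidate_replacements]
    return min(candidates, key=perplexity)
```

A usage example with made-up probabilities:

```python
probs = {"i": 0.1, "could": 0.05, "have": 0.2, "gone": 0.02}
score = lambda s: toy_perplexity(s, probs)
fixed = choose_correction(
    "i could of gone", r"\bcould of\b", ["could have", "could"], score
)
# fixed == "i could have gone"
```

A unigram scorer ignores word order, so it is only a placeholder; any function that maps a sentence to a perplexity can be passed in its place.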
