Xiaolong Wu

NC-SDF: Enhancing Indoor Scene Reconstruction Using Neural SDFs with View-Dependent Normal Compensation

May 01, 2024

Ziyi Chen, Xiaolong Wu, Yu Zhang

Abstract:State-of-the-art neural implicit surface representations have achieved impressive results in indoor scene reconstruction by incorporating monocular geometric priors as additional supervision. However, we have observed that multi-view inconsistency between such priors poses a challenge for high-quality reconstructions. In response, we present NC-SDF, a neural signed distance field (SDF) 3D reconstruction framework with view-dependent normal compensation (NC). Specifically, we integrate view-dependent biases in monocular normal priors into the neural implicit representation of the scene. By adaptively learning and correcting the biases, our NC-SDF effectively mitigates the adverse impact of inconsistent supervision, enhancing both the global consistency and local details in the reconstructions. To further refine the details, we introduce an informative pixel sampling strategy to pay more attention to intricate geometry with higher information content. Additionally, we design a hybrid geometry modeling approach to improve the neural implicit representation. Experiments on synthetic and real-world datasets demonstrate that NC-SDF outperforms existing approaches in terms of reconstruction quality.

Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval

Jan 01, 2024

Weihang Su, Qingyao Ai, Xiangsheng Li, Jia Chen, Yiqun Liu, Xiaolong Wu, Shengluan Hou

Abstract:With the development of deep learning and natural language processing techniques, pre-trained language models have been widely used to solve information retrieval (IR) problems. Benefiting from the pre-training and fine-tuning paradigm, these models achieve state-of-the-art performance. In previous works, plain texts in Wikipedia have been widely used in the pre-training stage. However, the rich structured information in Wikipedia, such as the titles, abstracts, hierarchical heading (multi-level title) structure, relationship between articles, references, hyperlink structures, and the writing organizations, has not been fully explored. In this paper, we devise four pre-training objectives tailored for IR tasks based on the structured knowledge of Wikipedia. Compared to existing pre-training methods, our approach can better capture the semantic knowledge in the training corpus by leveraging the human-edited structured data from Wikipedia. Experimental results on multiple IR benchmark datasets show the superior performance of our model in both zero-shot and fine-tuning settings compared to existing strong retrieval baselines. Besides, experimental results in biomedical and legal domains demonstrate that our approach achieves better performance in vertical domains compared to previous models, especially in scenarios where long text similarity matching is needed.

* Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

SSP: Self-Supervised Post-training for Conversational Search

Jul 02, 2023

Quan Tu, Shen Gao, Xiaolong Wu, Zhao Cao, Ji-Rong Wen, Rui Yan

Abstract:Conversational search has been regarded as the next-generation search paradigm. Constrained by data scarcity, most existing methods distill a well-trained ad-hoc retriever into a conversational retriever. However, these methods, which usually initialize parameters by query reformulation to discover contextualized dependency, have trouble understanding dialogue structure information and struggle with contextual semantic vanishing. In this paper, we propose Self-Supervised Post-training (SSP), a new post-training paradigm with three self-supervised tasks that efficiently initialize the conversational search model and enhance its understanding of dialogue structure and contextual semantics. Furthermore, SSP can be plugged into most existing conversational models to boost their performance. To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained by SSP to the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20. Extensive experiments show that SSP can boost the performance of several existing conversational search methods. Our source code is available at https://github.com/morecry/SSP.

* Accepted by ACL 2023 Findings, Long Paper

Building an Aerial-Ground Robotics System for Precision Farming

Nov 08, 2019

Alberto Pretto, Stéphanie Aravecchia, Wolfram Burgard, Nived Chebrolu, Christian Dornhege, Tillmann Falck, Freya Fleckenstein, Alessandra Fontenla, Marco Imperoli, Raghav Khanna (+18 more)

Abstract:The application of autonomous robots in agriculture is gaining more and more popularity thanks to the high impact it may have on food security, sustainability, resource use efficiency, reduction of chemical treatments, minimization of the human effort and maximization of yield. The Flourish research project faced this challenge by developing an adaptable robotic solution for precision farming that combines the aerial survey capabilities of small autonomous unmanned aerial vehicles (UAVs) with flexible targeted intervention performed by multi-purpose agricultural unmanned ground vehicles (UGVs). This paper presents an exhaustive overview of the scientific and technological advances and outcomes obtained in the Flourish project. We introduce multi-spectral perception algorithms and aerial and ground based systems developed to monitor crop density, weed pressure, crop nitrogen nutrition status, and to accurately classify and locate weeds. We then introduce the navigation and mapping systems to deal with the specificity of the employed robots and of the agricultural environment, highlighting the collaborative modules that enable the UAVs and UGVs to collect and share information in a unified environment model. We finally present the ground intervention hardware, software solutions, and interfaces we implemented and tested in different field conditions and with different crops. We describe here a real use case in which a UAV collaborates with a UGV to monitor the field and to perform selective spraying treatments in a totally autonomous way.

* Submitted to IEEE Robotics & Automation Magazine

Robust Semi-Direct Monocular Visual Odometry Using Edge and Illumination-Robust Cost

Sep 25, 2019

Xiaolong Wu, Cedric Pradalier

Abstract:In this work, we propose a monocular semi-direct visual odometry framework that exploits the best attributes of edge features and local photometric information for illumination-robust camera motion estimation and scene reconstruction. In the tracking layer, the edge alignment error and image gradient error are jointly optimized through a convergence-preserved reweighting strategy, which not only preserves the property of illumination invariance but also leads to a larger convergence basin and higher tracking accuracy compared with either approach used individually. In the mapping layer, a fast probabilistic 1D search strategy is proposed to locate the best photometrically matched point along all geometrically possible edges, which enables real-time edge point correspondence generation using only the high-frequency components of the image. The resultant reprojection error then replaces the edge alignment error for joint optimization in local bundle adjustment, avoiding the partial observability issue of monocular edge mapping and improving the stability of optimization. We present extensive analysis and evaluation of our proposed system on synthetic and real-world benchmark datasets under illumination changes and large camera motions, where our proposed system outperforms current state-of-the-art algorithms.

* 6 pages, 6 figures, 2 tables, submitted to ICRA 2020

Semantic Nearest Neighbor Fields Monocular Edge Visual-Odometry

Apr 01, 2019

Xiaolong Wu, Assia Benbihi, Antoine Richard, Cedric Pradalier

Abstract:Recent advances in deep learning for edge detection and segmentation open up a new path for semantic-edge-based ego-motion estimation. In this work, we propose a robust monocular visual odometry (VO) framework using category-aware semantic edges. It can reconstruct large-scale semantic maps in challenging outdoor environments. The core of our approach is a semantic nearest neighbor field that facilitates robust data association of edges across frames using semantics. This significantly enlarges the convergence radius during tracking phases. The proposed edge registration method can be easily integrated into direct VO frameworks to estimate photometrically, geometrically, and semantically consistent camera motions. Different types of edges are evaluated, and extensive experiments demonstrate that our proposed system outperforms state-of-the-art indirect, direct, and semantic monocular VO systems.
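
The abstract does not spell out how the semantic nearest neighbor field is built, so the following is only an illustrative sketch of the general idea: for every pixel, store the closest edge pixel of each semantic class, so that an edge is only ever associated with edges of the same class. The function name `semantic_nn_field`, the label encoding (0 for "no edge", positive integers for class ids), and the brute-force search are all assumptions made for this example, not the authors' implementation.

```python
from math import hypot

def semantic_nn_field(edge_labels):
    """Build one nearest-neighbor field per semantic edge class.

    edge_labels: 2D list; 0 means "no edge", a positive int is the class id
    of the edge at that pixel (hypothetical encoding for this sketch).
    Returns {class_id: 2D list of (row, col, distance)} giving, for every
    pixel, the nearest edge pixel of that class and its Euclidean distance.
    """
    rows, cols = len(edge_labels), len(edge_labels[0])

    # Group edge pixel coordinates by semantic class.
    by_class = {}
    for r in range(rows):
        for c in range(cols):
            label = edge_labels[r][c]
            if label:
                by_class.setdefault(label, []).append((r, c))

    fields = {}
    for label, pixels in by_class.items():
        field = []
        for r in range(rows):
            row = []
            for c in range(cols):
                # Brute-force search, fine for a toy grid; a distance
                # transform would be the usual choice at image resolution.
                nr, nc = min(pixels, key=lambda p: hypot(p[0] - r, p[1] - c))
                row.append((nr, nc, hypot(nr - r, nc - c)))
            field.append(row)
        fields[label] = field
    return fields

# Toy edge map: a class-1 edge pixel at (0, 0) and a class-2 edge pixel at (0, 3).
edges = [
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
fields = semantic_nn_field(edges)
```

In this toy map, the pixel at (1, 2) is geometrically closer to the class-2 edge at (0, 3), but a lookup in the class-1 field still returns (0, 0). Restricting matches by class in this way is the kind of constraint the abstract credits with more robust data association across frames.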
