Abstract:Visual-Inertial odometry (VIO) is known to suffer from drifting especially over long-term runs. In this paper, we present GVINS, a non-linear optimization based system that tightly fuses GNSS raw measurements with visual and inertial information for real-time and drift-free state estimation. Our system is aiming to provide accurate global 6-DoF estimation under complex indoor-outdoor environment where GNSS signals may be largely intercepted or even totally unavailable. To connect global measurements with local states, a coarse-to-fine initialization procedure is proposed to efficiently online calibrate the transformation and initialize GNSS states from only a short window of measurements. The GNSS pseudorange and Doppler shift measurements are then modelled and optimized under a factor graph framework along with visual and inertial constraints. For complex and GNSS-unfriendly areas, the degenerate cases are discussed and carefully handled to ensure robustness. The engineering challenges involved in the system are also included to facilitate relevant GNSS fusion researches. Thanks to the tightly-coupled multi-sensor approach and system design, our system fully exploits the merits of three types of sensors and is capable to seamlessly cope with the transition between indoor and outdoor environments, where satellites are lost and recaptured again. We extensively evaluate the proposed system by both simulation and real-world experiments, and the result demonstrates that our system substantially eliminates the drift of VIO and preserves the local accuracy in spite of noisy GNSS measurements. In addition, experiments also show that our system can gain from even a single satellite while conventional GNSS algorithms need four at lease.
Abstract:Accurate state estimation is a fundamental problem for autonomous robots. To achieve locally accurate and globally drift-free state estimation, multiple sensors with complementary properties are usually fused together. Local sensors (camera, IMU, LiDAR, etc) provide precise pose within a small region, while global sensors (GPS, magnetometer, barometer, etc) supply noisy but globally drift-free localization in a large-scale environment. In this paper, we propose a sensor fusion framework to fuse local states with global sensors, which achieves locally accurate and globally drift-free pose estimation. Local estimations, produced by existing VO/VIO approaches, are fused with global sensors in a pose graph optimization. Within the graph optimization, local estimations are aligned into a global coordinate. Meanwhile, the accumulated drifts are eliminated. We evaluate the performance of our system on public datasets and with real-world experiments. Results are compared against other state-of-the-art algorithms. We highlight that our system is a general framework, which can easily fuse various global sensors in a unified pose graph optimization. Our implementations are open source\footnote{https://github.com/HKUST-Aerial-Robotics/VINS-Fusion}.
Abstract:Nowadays, more and more sensors are equipped on robots to increase robustness and autonomous ability. We have seen various sensor suites equipped on different platforms, such as stereo cameras on ground vehicles, a monocular camera with an IMU (Inertial Measurement Unit) on mobile phones, and stereo cameras with an IMU on aerial robots. Although many algorithms for state estimation have been proposed in the past, they are usually applied to a single sensor or a specific sensor suite. Few of them can be employed with multiple sensor choices. In this paper, we proposed a general optimization-based framework for odometry estimation, which supports multiple sensor sets. Every sensor is treated as a general factor in our framework. Factors which share common state variables are summed together to build the optimization problem. We further demonstrate the generality with visual and inertial sensors, which form three sensor suites (stereo cameras, a monocular camera with an IMU, and stereo cameras with an IMU). We validate the performance of our system on public datasets and through real-world experiments with multiple sensors. Results are compared against other state-of-the-art algorithms. We highlight that our system is a general framework, which can easily fuse various sensors in a pose graph optimization. Our implementations are open source\footnote{https://github.com/HKUST-Aerial-Robotics/VINS-Fusion}.
Abstract:Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their effectiveness limits further improvements on their architectures. In this paper, we present a visual analytics method for understanding and comparing RNN models for NLP tasks. We propose a technique to explain the function of individual hidden state units based on their expected response to input texts. We then co-cluster hidden state units and words based on the expected response and visualize co-clustering results as memory chips and word clouds to provide more structured knowledge on RNNs' hidden states. We also propose a glyph-based sequence visualization based on aggregate information to analyze the behavior of an RNN's hidden state at the sentence-level. The usability and effectiveness of our method are demonstrated through case studies and reviews from domain experts.