Abstract: Sharing virtual content among multiple smart glasses wearers is an essential feature of a seamless Collaborative Augmented Reality experience. To enable the sharing, the local coordinate systems of the underlying 6D ego-pose trackers, running independently on each set of glasses, have to be spatially and temporally aligned with respect to each other. In this paper, we propose a novel lightweight solution for this problem, which we refer to as ego-motion alignment. We show that detecting each other's face or glasses, together with the tracker ego-poses, sufficiently conditions the problem to spatially relate the local coordinate systems. Importantly, the detected glasses can serve as reliable anchors that provide sufficient accuracy for the targeted practical use. The proposed idea allows us to abandon the traditional visual localization step with fiducial markers or scene points as anchors. A novel closed-form minimal solver, which solves a Quadratic Eigenvalue Problem, is derived, and its refinement with Gaussian Belief Propagation is introduced. Experiments validate the presented approach and show its high practical potential.
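For readers unfamiliar with the Quadratic Eigenvalue Problem (QEP) named in the abstract above, the following is a minimal generic sketch of how a QEP of the form $(\lambda^2 M + \lambda C + K)x = 0$ can be solved by companion linearization. It is not the solver derived in the paper: the function name `solve_qep`, the matrices `M`, `C`, `K`, and the random test data are illustrative assumptions, and `M` is assumed to be invertible.

```python
import numpy as np


def solve_qep(M, C, K):
    """Solve (lam^2 * M + lam * C + K) x = 0 by companion linearization.

    Hypothetical helper for illustration only; assumes M is invertible.
    Returns the 2n eigenvalues and an (n, 2n) array whose columns are the
    corresponding eigenvectors x.
    """
    n = M.shape[0]
    # With z = [x; lam * x], the QEP becomes the standard problem A z = lam z.
    A = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-np.linalg.solve(M, K), -np.linalg.solve(M, C)],
    ])
    lam, Z = np.linalg.eig(A)
    X = Z[:n, :]  # top block of each eigenvector z is x
    return lam, X


# Illustrative random problem (not data from the paper).
rng = np.random.default_rng(0)
n = 3
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))

lam, X = solve_qep(M, C, K)

# Relative residual of the quadratic equation for every eigenpair.
residuals = [
    np.linalg.norm((l**2 * M + l * C + K) @ X[:, i]) / np.linalg.norm(X[:, i])
    for i, l in enumerate(lam)
]
```

An n-dimensional QEP has 2n eigenpairs, so a minimal solver built on it typically has to select the physically meaningful solution afterwards; how that selection is done is specific to the paper and not shown here.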
Abstract: In this paper, we deal with the initialization problem of a visual-inertial odometry system with rolling shutter cameras. The initialization is a prerequisite for utilizing inertial signals and fusing them with the visual data. We propose a novel way to solve this problem on visual and inertial data simultaneously in a statistical sense, by casting it into the renormalization scheme of Kanatani. The renormalization is an optimization scheme intended to reduce the inherent statistical bias of common linear systems. We derive and present the necessary steps and methodology specific to the initialization problem. Extensive evaluations on perfect ground truth exhibit superior performance and up to 20% accuracy gain over the originally proposed Least Squares solution. The renormalization performs similarly to the optimal Maximum Likelihood estimate, despite arriving at the solution by different means. By this, we extend the set of common Computer Vision problems which can be cast into the renormalization scheme.
Abstract: In this paper, an efficient closed-form solution for the state initialization in visual-inertial odometry (VIO) and simultaneous localization and mapping (SLAM) is presented. Unlike the state of the art, we do not derive linear equations from triangulating pairs of point observations. Instead, we build on a direct triangulation of the unknown 3D point paired with each of its observations. We show and validate the high impact of such a simple difference. The resulting linear system has a simpler structure, and the solution through analytic elimination only requires solving a $6 \times 6$ linear system (or $9 \times 9$ when accelerometer bias is included). In addition, all the observations of every scene point are jointly related, thereby leading to a less biased and more robust solution. The proposed formulation attains up to 50 percent lower velocity and point reconstruction error compared to the standard closed-form solver. Apart from the inherent efficiency, fewer iterations are needed by any further non-linear refinement thanks to better parameter initialization. In this context, we provide the analytic Jacobians for a non-linear optimizer that optionally refines the initial parameters. The superior performance of the proposed solver is established by quantitative comparisons with the state-of-the-art solver.
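To illustrate the kind of analytic elimination the abstract above refers to, the sketch below eliminates per-point unknowns from a generic linear least-squares problem so that only a $6 \times 6$ system has to be solved. It is a plain Schur-complement example under stated assumptions, not the paper's formulation: the residual model $A_j x + B_j p_j - b_j$, the function name `solve_by_elimination`, and the random data are all hypothetical, with `x` standing in for six shared parameters and each `p_j` for one 3D point.

```python
import numpy as np


def solve_by_elimination(A_blocks, B_blocks, b_blocks):
    """Least-squares solve of A_j x + B_j p_j = b_j with points eliminated.

    Hypothetical helper: A_j is (m_j, 6), B_j is (m_j, 3), b_j is (m_j,).
    Only a 6x6 system in the shared unknowns x is solved; each 3D point
    p_j is recovered afterwards by back-substitution.
    """
    S = np.zeros((6, 6))  # reduced (Schur complement) matrix
    r = np.zeros(6)       # reduced right-hand side
    cache = []
    for A, B, b in zip(A_blocks, B_blocks, b_blocks):
        D = B.T @ B                            # 3x3 block for this point
        W = A.T @ B                            # 6x3 coupling block
        D_inv_Wt = np.linalg.solve(D, W.T)     # D^{-1} W^T, shape (3, 6)
        D_inv_g = np.linalg.solve(D, B.T @ b)  # D^{-1} B^T b, shape (3,)
        S += A.T @ A - W @ D_inv_Wt
        r += A.T @ b - W @ D_inv_g
        cache.append((D_inv_Wt, D_inv_g))
    x = np.linalg.solve(S, r)
    points = [D_inv_g - D_inv_Wt @ x for D_inv_Wt, D_inv_g in cache]
    return x, points


# Illustrative noise-free data (not from the paper): 5 points, 8 rows each.
rng = np.random.default_rng(1)
x_true = rng.standard_normal(6)
p_true = [rng.standard_normal(3) for _ in range(5)]
A_blocks = [rng.standard_normal((8, 6)) for _ in p_true]
B_blocks = [rng.standard_normal((8, 3)) for _ in p_true]
b_blocks = [A @ x_true + B @ p for A, B, p in zip(A_blocks, B_blocks, p_true)]

x_est, p_est = solve_by_elimination(A_blocks, B_blocks, b_blocks)
```

The size of the reduced system stays at $6 \times 6$ regardless of how many points are observed, since every point contributes only a $3 \times 3$ inverse. This is the general reason such an elimination is cheap.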