Image-to-point cloud (I2P) registration is a fundamental task in robot navigation and mobile mapping. Existing I2P registration methods estimate correspondences at the point-to-pixel level and neglect global alignment. However, I2P matching without high-level guidance from global constraints can easily converge to a local optimum. To solve this problem, this paper proposes CoFiI2P, a novel I2P registration network that extracts correspondences in a coarse-to-fine manner to reach a globally optimal solution. First, the image and point cloud are fed into a Siamese encoder-decoder network for hierarchical feature extraction. Then, a coarse-to-fine matching module is designed to exploit these features and establish resilient feature correspondences. Specifically, in the coarse matching block, a novel I2P transformer module is employed to capture the homogeneous and heterogeneous global information from the image and point cloud. With the resulting discriminative descriptors, coarse super-point-to-super-pixel matching pairs are estimated. In the fine matching block, point-to-pixel pairs are established under the supervision of the super-point-to-super-pixel correspondences. Finally, based on the matching pairs, the transformation matrix is estimated with the EPnP-RANSAC algorithm. Extensive experiments conducted on the KITTI dataset demonstrate that CoFiI2P achieves a relative rotation error (RRE) of 2.25 degrees and a relative translation error (RTE) of 0.61 meters. These results represent a significant improvement of 14% in RRE and 52% in RTE compared to the current state-of-the-art (SOTA) method. The demo video for the experiments is available at https://youtu.be/TG2GBrJTuW4. The source code will be public at https://github.com/kang-1-2-3/CoFiI2P.
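
The two reported metrics can be illustrated with a minimal sketch. The abstract does not define RRE and RTE, so the code below assumes one common convention: RRE as the geodesic angle of the residual rotation between the ground-truth and estimated rotation matrices, and RTE as the Euclidean distance between the two translation vectors. The paper may use a different definition (for example, a sum of Euler angles), and the function names here are hypothetical, not taken from the CoFiI2P code.

```python
import math

def relative_rotation_error_deg(R_gt, R_est):
    """Geodesic angle in degrees of R_gt^T @ R_est (assumed RRE definition).

    Both arguments are 3x3 rotation matrices given as nested lists.
    """
    # trace(R_gt^T @ R_est) equals the sum of element-wise products.
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def relative_translation_error(t_gt, t_est):
    """Euclidean distance between translations (assumed RTE definition)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_gt, t_est)))

# Illustrative values only: an estimate that is off by a 2-degree rotation
# about the z-axis and by 0.5 m in the xy-plane.
theta = math.radians(2.0)
R_gt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R_est = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
rre = relative_rotation_error_deg(R_gt, R_est)
rte = relative_translation_error([0.0, 0.0, 0.0], [0.3, 0.4, 0.0])
print(f"RRE = {rre:.2f} deg, RTE = {rte:.2f} m")
```

Under this convention both errors are zero for a perfect estimate, and RRE is independent of the rotation axis, which makes per-frame errors straightforward to average over a dataset.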