Qianliang Wu

Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration

Mar 07, 2025

Qianliang Wu, Haobo Jiang, Yaqing Ding, Lei Luo, Jin Xie, Jian Yang

Abstract:Establishing reliable correspondences is crucial for all registration tasks, including 2D image registration, 3D point cloud registration, and 2D-3D image-to-point cloud registration. However, these tasks are often complicated by challenges such as scale inconsistencies, symmetry, and large deformations, which can lead to ambiguous matches. Previous feature-based and correspondence-based methods typically rely on geometric or semantic features to generate or polish initial potential correspondences. Some methods typically leverage specific geometric priors, such as topological preservation, to devise diverse and innovative strategies tailored to a given enhancement goal, which cannot be exhaustively enumerated. Additionally, many previous approaches rely on a single-step prediction head, which can struggle with local minima in complex matching scenarios. To address these challenges, we introduce an innovative paradigm that leverages a diffusion model in matrix space for robust matching matrix estimation. Our model treats correspondence estimation as a denoising diffusion process in the matching matrix space, gradually refining the intermediate matching matrix to the optimal one. Specifically, we apply the diffusion model in the doubly stochastic matrix space for 3D-3D and 2D-3D registration tasks. In the 2D image registration task, we deploy the diffusion model in a matrix subspace where dual-softmax projection regularization is applied. For all three registration tasks, we provide adaptive matching matrix embedding implementations tailored to the specific characteristics of each task while maintaining a consistent "match-to-warp" encoding pattern. Furthermore, we adopt a lightweight design for the denoising module. In inference, once points or image features are extracted and fixed, this module performs multi-step denoising predictions through reverse sampling.

* arXiv admin note: text overlap with arXiv:2403.19919
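The dual-softmax projection mentioned in the abstract above can be illustrated with a minimal NumPy sketch. It uses the form common in the image matching literature (element-wise product of a row-wise and a column-wise softmax of a score matrix). The function names, the `temperature` parameter, and the random scores are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_softmax(scores, temperature=0.1):
    # Hypothetical helper: product of row-wise and column-wise softmax.
    # An entry is large only if it dominates both its row and its column.
    s = scores / temperature
    return softmax(s, axis=1) * softmax(s, axis=0)

rng = np.random.default_rng(0)
scores = rng.standard_normal((4, 5))  # made-up similarity scores
P = dual_softmax(scores)
print(P.round(3))
```

Every row and column of `P` sums to at most 1, so the result behaves like a soft partial assignment rather than a doubly stochastic matrix.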

Three-view Focal Length Recovery From Homographies

Jan 13, 2025

Yaqing Ding, Viktor Kocur, Zuzana Berger Haladová, Qianliang Wu, Shen Cai, Jian Yang, Zuzana Kukelova

Abstract:In this paper, we propose a novel approach for recovering focal lengths from three-view homographies. By examining the consistency of normal vectors between two homographies, we derive new explicit constraints between the focal lengths and homographies using an elimination technique. We demonstrate that three-view homographies provide two additional constraints, enabling the recovery of one or two focal lengths. We discuss four possible cases, including three cameras having an unknown equal focal length, three cameras having two different unknown focal lengths, three cameras where one focal length is known, and the other two cameras have equal or different unknown focal lengths. All the problems can be converted into solving polynomials in one or two unknowns, which can be efficiently solved using Sturm sequence or hidden variable technique. Evaluation using both synthetic and real data shows that the proposed solvers are both faster and more accurate than methods relying on existing two-view solvers. The code and data are available on https://github.com/kocurvik/hf

* Code available at https://github.com/kocurvik/hf; dataset available at https://doi.org/10.5281/zenodo.14638904
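The abstract above mentions Sturm sequences as one way to solve the resulting univariate polynomials. The sketch below is a generic illustration of Sturm's theorem for counting real roots in an interval. It is not the authors' solver (see the linked repository for that), and the function names and the example cubic are assumptions.

```python
import numpy as np

def sturm_sequence(coeffs):
    # Build p0 = p, p1 = p', p_{k+1} = -remainder(p_{k-1}, p_k).
    # Coefficients are ordered from highest to lowest degree.
    p0 = np.trim_zeros(np.asarray(coeffs, dtype=float), "f")
    seq = [p0, np.polyder(p0)]
    while len(seq[-1]) > 1:
        _, rem = np.polydiv(seq[-2], seq[-1])
        if np.allclose(rem, 0.0):
            break
        seq.append(-rem)
    return seq

def sign_changes(seq, x):
    # Count sign changes in the sequence evaluated at x, skipping zeros.
    values = [np.polyval(p, x) for p in seq]
    signs = [np.sign(v) for v in values if v != 0.0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def count_real_roots(coeffs, a, b):
    # Distinct real roots in (a, b], assuming neither a nor b is a
    # multiple root of the polynomial.
    seq = sturm_sequence(coeffs)
    return sign_changes(seq, a) - sign_changes(seq, b)

# Example polynomial (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6.
cubic = [1.0, -6.0, 11.0, -6.0]
print(count_real_roots(cubic, 0.0, 2.5))
print(count_real_roots(cubic, 0.0, 4.0))
```

Once an interval is known to contain exactly one root, that root can be refined by bisection, which is why root counting of this kind suits fast minimal solvers.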

Diff-Reg v1: Diffusion Matching Model for Registration Problem

Mar 29, 2024

Qianliang Wu, Haobo Jiang, Lei Luo, Jun Li, Yaqing Ding, Jin Xie, Jian Yang

Abstract:Establishing reliable correspondences is essential for registration tasks such as 3D and 2D3D registration. Existing methods commonly leverage geometric or semantic point features to generate potential correspondences. However, these features may face challenges such as large deformation, scale inconsistency, and ambiguous matching problems (e.g., symmetry). Additionally, many previous methods, which rely on single-pass prediction, may struggle with local minima in complex scenarios. To mitigate these challenges, we introduce a diffusion matching model for robust correspondence construction. Our approach treats correspondence estimation as a denoising diffusion process within the doubly stochastic matrix space, which gradually denoises (refines) a doubly stochastic matching matrix to the ground-truth one for high-quality correspondence estimation. It involves a forward diffusion process that gradually introduces Gaussian noise into the ground truth matching matrix and a reverse denoising process that iteratively refines the noisy matching matrix. In particular, the feature extraction from the backbone occurs only once during the inference phase. Our lightweight denoising module utilizes the same feature at each reverse sampling step. Evaluation of our method on both 3D and 2D3D registration tasks confirms its effectiveness.

* arXiv admin note: text overlap with arXiv:2401.00436
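The forward diffusion process described in the abstract above, which gradually adds Gaussian noise to the ground-truth matching matrix, can be sketched with the standard DDPM closed form. The linear noise schedule, the 4x4 identity matching matrix, and all names below are assumptions for illustration. The sketch also omits any projection back to the doubly stochastic matrix space.

```python
import numpy as np

def forward_noise(x0, t, alpha_bar, rng):
    # Standard DDPM closed form (assumed here):
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

T = 100
betas = np.linspace(1e-4, 0.02, T)    # hypothetical linear schedule
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retention

x0 = np.eye(4)                        # toy ground truth: point i matches point i
rng = np.random.default_rng(0)
x_early, _ = forward_noise(x0, 0, alpha_bar, rng)
x_late, _ = forward_noise(x0, T - 1, alpha_bar, rng)
print(np.abs(x_early - x0).max(), np.abs(x_late - x0).max())
```

At small `t` the noisy matrix stays close to the ground truth, while at large `t` the noise term carries more weight. A reverse process would start from such a noisy matrix and denoise it step by step.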

Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration

Jan 17, 2024

Qianliang Wu, Haobo Jiang, Yaqing Ding, Lei Luo, Jin Xie, Jian Yang

Abstract:Efficiently finding optimal correspondences between point clouds is crucial for solving both rigid and non-rigid point cloud registration problems. Existing methods often rely on geometric or semantic feature embedding to establish correspondences and estimate transformations or flow fields. Recently, state-of-the-art methods have employed RAFT-like iterative updates to refine the solution. However, these methods have certain limitations. Firstly, their iterative refinement design lacks transparency, and their iterative updates follow a fixed path during the refinement process, which can lead to suboptimal results. Secondly, these methods overlook the importance of refining or optimizing correspondences (or matching matrices) as a precursor to solving transformations or flow fields. They typically compute candidate correspondences based on distances in the point feature space. However, they only project the candidate matching matrix into some matrix space once with Sinkhorn or dual softmax operations to obtain final correspondences. This one-shot projected matching matrix may be far from the globally optimal one, and these approaches do not consider the distribution of the target matching matrix. In this paper, we propose a novel approach that exploits the Denoising Diffusion Model to predict a searching gradient for the optimal matching matrix within the Doubly Stochastic Matrix Space. During the reverse denoising process, our method iteratively searches for better solutions along this denoising gradient, which points towards the maximum likelihood direction of the target matching matrix. Our method offers flexibility by allowing the search to start from any initial matching matrix provided by the online backbone or white noise. Experimental evaluations on the 3DMatch/3DLoMatch and 4DMatch/4DLoMatch datasets demonstrate the effectiveness of our newly designed framework.
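The one-shot Sinkhorn projection that the abstract above contrasts with iterative search can be sketched as alternating row and column normalization in log space. This is a generic illustration for a square score matrix. The function names, the iteration count, and the random scores are assumptions, not the paper's code.

```python
import numpy as np

def logsumexp(x, axis):
    # Stable log(sum(exp(x))) along an axis, keeping dimensions.
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def sinkhorn(scores, n_iters=100):
    # Alternately normalize rows and columns in log space so that
    # exp(log_p) approaches a doubly stochastic matrix.
    log_p = np.asarray(scores, dtype=float)
    for _ in range(n_iters):
        log_p = log_p - logsumexp(log_p, axis=1)  # rows sum to 1
        log_p = log_p - logsumexp(log_p, axis=0)  # columns sum to 1
    return np.exp(log_p)

rng = np.random.default_rng(0)
scores = rng.standard_normal((5, 5))  # made-up feature similarities
P = sinkhorn(scores)
print(P.sum(axis=0), P.sum(axis=1))
```

The output depends only on the initial scores, which illustrates the limitation the abstract points out: a single projection gives no mechanism for moving toward a better matching matrix.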

SGFeat: Salient Geometric Feature for Point Cloud Registration

Sep 12, 2023

Qianliang Wu, Yaqing Ding, Lei Luo, Chuanwei Zhou, Jin Xie, Jian Yang

Abstract:Point Cloud Registration (PCR) is a critical and challenging task in computer vision. One of the primary difficulties in PCR is identifying salient and meaningful points that exhibit consistent semantic and geometric properties across different scans. Previous methods have encountered challenges with ambiguous matching due to the similarity among patch blocks throughout the entire point cloud and the lack of consideration for efficient global geometric consistency. To address these issues, we propose a new framework that includes several novel techniques. Firstly, we introduce a semantic-aware geometric encoder that combines object-level and patch-level semantic information. This encoder significantly improves registration recall by reducing ambiguity in patch-level superpoint matching. Additionally, we incorporate a prior knowledge approach that utilizes an intrinsic shape signature to identify salient points. This enables us to extract the most salient super points and meaningful dense points in the scene. Secondly, we introduce an innovative transformer that encodes High-Order (HO) geometric features. These features are crucial for identifying salient points within initial overlap regions while considering global high-order geometric consistency. To optimize this high-order transformer further, we introduce an anchor node selection strategy. By encoding inter-frame triangle or polyhedron consistency features based on these anchor nodes, we can effectively learn high-order geometric features of salient super points. These high-order features are then propagated to dense points and utilized by a Sinkhorn matching module to identify key correspondences for successful registration. In our experiments conducted on well-known datasets such as 3DMatch/3DLoMatch and KITTI, our approach has shown promising results, highlighting the effectiveness of our novel method.

Large-scale Point Cloud Registration Based on Graph Matching Optimization

Feb 16, 2023

Qianliang Wu, Yaqi Shen, Guofeng Mei, Yaqing Ding, Lei Luo, Jin Xie, Jian Yang

Abstract:Point cloud registration is a fundamental and challenging problem in 3D computer vision. It has been shown that the isometric transformation is an essential property in rigid point cloud registration, but existing methods only utilize it in the outlier rejection stage. In this paper, we emphasize that the isometric transformation is also important in the feature learning stage for improving registration quality. We propose a Graph Matching Optimization based Network (denoted as GMONet for short), which utilizes the graph matching method to explicitly exert the isometry-preserving constraints in the point feature learning stage to improve the point representation. Specifically, we exploit the partial graph matching constraint to enhance the overlap region detection abilities of super points (i.e., down-sampled key points) and full graph matching to refine the registration accuracy at the fine-level overlap region. Meanwhile, we leverage mini-batch sampling to improve the efficiency of the full graph matching optimization. Given highly discriminative point features in the evaluation stage, we utilize the RANSAC approach to estimate the transformation between the scanned pairs. The proposed method has been evaluated on the 3DMatch/3DLoMatch benchmarks and the KITTI benchmark. The experimental results show that our method achieves competitive performance compared with the existing state-of-the-art baselines.
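The RANSAC step mentioned in the abstract above, which estimates a rigid transformation from putative correspondences, can be sketched as follows. This is a generic illustration with a Kabsch (SVD) solver on three-point samples. The function names, threshold, iteration count, and synthetic data are all assumptions, not GMONet's pipeline.

```python
import numpy as np

def kabsch(src, dst):
    # Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t.
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def ransac_rigid(src, dst, n_iters=200, inlier_thresh=0.05, seed=0):
    # Fit on random 3-point samples, keep the largest consensus set,
    # then refit on all of its inliers.
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = kabsch(src[idx], dst[idx])
        inliers = np.linalg.norm(src @ R.T + t - dst, axis=1) < inlier_thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("RANSAC found too few inliers")
    R, t = kabsch(src[best], dst[best])
    return R, t, best

# Synthetic correspondences: 30 exact matches plus 10 random outliers.
data_rng = np.random.default_rng(1)
angle = np.deg2rad(30.0)
R_gt = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                 [np.sin(angle),  np.cos(angle), 0.0],
                 [0.0, 0.0, 1.0]])
t_gt = np.array([0.5, -0.2, 1.0])
src = data_rng.uniform(-1.0, 1.0, size=(40, 3))
dst = src @ R_gt.T + t_gt
dst[30:] = data_rng.uniform(-2.0, 2.0, size=(10, 3))

R_est, t_est, inlier_mask = ransac_rigid(src, dst)
print(inlier_mask.sum(), np.abs(R_est - R_gt).max())
```

Because a rigid transform preserves pairwise distances, samples that contain outliers rarely gather a large consensus set, which is the same isometry property the abstract proposes to exploit earlier, at feature learning time.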

Interest-Behaviour Multiplicative Network for Resource-limited Recommendation

Oct 10, 2020

Qianliang Wu, Tong Zhang, Zhen Cui, Jian Yang

Abstract:Resource constraints, e.g. limited product inventory or product categories, may affect consumers' choices or preferences in some recommendation tasks, but are usually ignored in previous recommendation methods. In this paper, we aim to mine the cue of user preferences in resource-limited recommendation tasks, for which purpose we specifically build a large used-car transaction dataset possessing resource-limitation characteristics. Accordingly, we propose an interest-behaviour multiplicative network to predict the user's future interaction based on dynamic connections between users and items. To describe the user-item connection dynamically, mutually-recursive recurrent neural networks (MRRNNs) are introduced to capture interactive long-term dependencies, and meanwhile effective representations of users and items are obtained. To further take the resource limitation into consideration, a resource-limited branch is built to specifically explore the influence of resource variation caused by user behaviour on user preferences. Finally, mutual information is introduced to measure the similarity between the user action and fused features to predict future interaction, where the fused features come from both the MRRNNs and the resource-limited branch. We test the performance on our used-car transaction dataset as well as the Tmall dataset, and the experimental results verify the effectiveness of our framework.
