Abstract: Understanding the anisotropic reflectance of complex Earth surfaces from satellite imagery is crucial for numerous applications. Neural radiance fields (NeRF) have become popular as a machine learning technique capable of deducing the bidirectional reflectance distribution function (BRDF) of a scene from multiple images. However, prior research has largely concentrated on applying NeRF to close-range imagery, estimating basic microfacet BRDF models, which fall short for many Earth surfaces. Moreover, high-quality NeRFs generally require several images captured simultaneously, a rare occurrence in satellite imaging. To address these limitations, we propose BRDF-NeRF, developed to explicitly estimate the Rahman-Pinty-Verstraete (RPV) model, a semi-empirical BRDF model commonly employed in remote sensing. We assess our approach using two datasets: (1) Djibouti, captured in a single epoch at varying viewing angles with a fixed Sun position, and (2) Lanzhou, captured over multiple epochs with different viewing angles and Sun positions. Our results, based on only three to four satellite images for training, demonstrate that BRDF-NeRF can effectively synthesize novel views from directions far removed from the training data and produce high-quality digital surface models (DSMs).
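For reference, a common statement of the RPV model (notation ours; in the original three-parameter form, $\rho_c = \rho_0$) factors the reflectance into a modified Minnaert term, a Henyey-Greenstein phase function, and a hot-spot correction:

$$ \rho(\theta_s,\theta_v,\phi) \;=\; \rho_0\, \frac{\cos^{k-1}\theta_s\,\cos^{k-1}\theta_v}{(\cos\theta_s+\cos\theta_v)^{1-k}}\; \frac{1-\Theta^2}{\left(1+\Theta^2+2\Theta\cos g\right)^{3/2}}\; \left[1+\frac{1-\rho_c}{1+G}\right] $$

where $\theta_s$, $\theta_v$ are the Sun and view zenith angles, $\phi$ the relative azimuth, $g$ the phase angle, and $G=\sqrt{\tan^2\theta_s+\tan^2\theta_v-2\tan\theta_s\tan\theta_v\cos\phi}$; the free parameters $(\rho_0, k, \Theta, \rho_c)$ are the quantities BRDF-NeRF is trained to estimate.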
Abstract: Dense matching is crucial for 3D scene reconstruction, since it enables the recovery of a scene's 3D geometry from image acquisitions. Deep learning (DL)-based methods have proven effective for the special case of epipolar stereo disparity estimation in the computer vision community. However, DL-based methods depend heavily on the quality and quantity of training data, and generating ground-truth disparity maps for real scenes remains a challenging task in the photogrammetry community. To address this challenge, we propose a method for generating ground-truth disparity maps directly from Light Detection and Ranging (LiDAR) data and images, producing a large and diverse collection of six aerial datasets covering four different areas, including two areas imaged at different resolutions. We also introduce a LiDAR-to-image co-registration refinement into the framework, which takes special precautions regarding occlusions and refrains from disparity interpolation to avoid precision loss. Evaluating 11 dense matching methods across datasets with diverse scene types, image resolutions, and geometric configurations, with an in-depth investigation of dataset shift, we find that GANet performs best when training and testing data are identical, that PSMNet is the most robust across different datasets, and we propose a best-practice strategy for training with a limited dataset. We will also release the datasets and trained models; more information can be found at https://github.com/whuwuteng/Aerial_Stereo_Dataset.
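As a minimal sketch of the core idea (illustrative interfaces, not the authors' implementation): for an epipolar-rectified pair, each LiDAR point is projected into both images, the disparity is the difference of column coordinates, a z-buffer guards against occlusions, and pixels not hit by any point are left empty rather than interpolated:

    import numpy as np

    def lidar_to_disparity(points, P_left, P_right, height, width):
        """Project LiDAR points into an epipolar-rectified pair and difference
        the column coordinates; hypothetical interface for illustration only.
        points: (N, 3) LiDAR coordinates; P_left, P_right: 3x4 projection
        matrices of the two rectified cameras."""
        pts_h = np.hstack([points, np.ones((len(points), 1))])
        l = (P_left @ pts_h.T).T              # homogeneous left image coords
        r = (P_right @ pts_h.T).T             # homogeneous right image coords
        xl, yl, zl = l[:, 0] / l[:, 2], l[:, 1] / l[:, 2], l[:, 2]
        xr = r[:, 0] / r[:, 2]
        disp = np.full((height, width), np.nan)    # NaN = no ground truth here
        depth = np.full((height, width), np.inf)   # z-buffer against occlusion
        u, v = np.round(xl).astype(int), np.round(yl).astype(int)
        ok = (u >= 0) & (u < width) & (v >= 0) & (v < height) & (zl > 0)
        for ui, vi, zi, di in zip(u[ok], v[ok], zl[ok], (xl - xr)[ok]):
            if zi < depth[vi, ui]:                 # keep the closest surface
                depth[vi, ui] = zi
                disp[vi, ui] = di
        return disp

Leaving unobserved pixels empty corresponds to what the abstract calls refraining from disparity interpolation to avoid precision loss.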
Abstract: Digital surface model generation using traditional multi-view stereo (MVS) matching performs poorly over non-Lambertian surfaces, with asynchronous acquisitions, or at discontinuities. Neural radiance fields (NeRF) offer a new paradigm for reconstructing surface geometries using a continuous volumetric representation. NeRF is self-supervised, does not require ground-truth geometry for training, and provides an elegant way to include physical parameters about the scene in its representation, thus potentially remedying the challenging scenarios where MVS fails. However, NeRF and its variants require many views to produce convincing scene geometry, and such views are rarely available in Earth observation satellite imaging. In this paper we present SparseSat-NeRF (SpS-NeRF), an extension of Sat-NeRF adapted to sparse satellite views. SpS-NeRF employs dense depth supervision guided by a cross-correlation similarity metric provided by traditional semi-global MVS matching. We demonstrate the effectiveness of our approach on stereo and tri-stereo Pleiades 1B/WorldView-3 images, and compare against NeRF and Sat-NeRF. The code is available at https://github.com/LulinZhang/SpS-NeRF.
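A minimal sketch of the supervision scheme, with illustrative names rather than the actual SpS-NeRF code: the photometric NeRF loss is complemented by a depth term on rays where semi-global matching produced a depth, weighted by the cross-correlation similarity so that unreliable matches contribute little:

    import torch

    def sps_nerf_style_loss(rgb_pred, rgb_gt, depth_pred, depth_mvs, corr, lam=1.0):
        """Photometric loss plus correlation-weighted depth supervision.
        corr holds per-ray cross-correlation scores from semi-global matching
        (illustrative interface); low-similarity rays get little depth weight."""
        loss_rgb = ((rgb_pred - rgb_gt) ** 2).mean()
        w = corr.clamp(min=0.0)                       # trust high-similarity rays
        loss_depth = (w * (depth_pred - depth_mvs).abs()).mean()
        return loss_rgb + lam * loss_depth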
Abstract: We present three multi-scale similarity learning architectures, or DeepSim networks. These models learn pixel-level matching with a contrastive loss and are agnostic to the geometry of the considered scene. We establish a middle ground between hybrid and end-to-end approaches by learning to densely allocate all corresponding pixels of an epipolar pair at once. Our features are learnt on large image tiles to be expressive and capture the scene's wider context. We also demonstrate that curated sample mining can enhance the overall robustness of the predicted similarities and improve the performance on radiometrically homogeneous areas. We run experiments on aerial and satellite datasets. Our DeepSim-Nets outperform the baseline hybrid approaches and generalize better to unseen scene geometries than end-to-end methods. Our flexible architecture can be readily adopted in standard multi-resolution image matching pipelines.
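As a hedged illustration of one way to realize such a pixel-level contrastive objective (an InfoNCE-style formulation over epipolar lines; the actual DeepSim-Nets loss may differ):

    import torch
    import torch.nn.functional as F

    def epipolar_contrastive_loss(feat_l, feat_r, gt_disp, tau=0.07):
        """InfoNCE-style contrastive loss along epipolar lines.
        feat_l, feat_r: (C, H, W) L2-normalised descriptor maps;
        gt_disp: (H, W) integer ground-truth disparity (illustrative)."""
        C, H, W = feat_l.shape
        loss = torch.zeros(())
        for v in range(H):                        # one epipolar line at a time
            sim = feat_l[:, v, :].T @ feat_r[:, v, :] / tau   # (W, W) scores
            target = torch.arange(W) - gt_disp[v].long()      # true match column
            valid = (target >= 0) & (target < W)
            if valid.any():
                loss = loss + F.cross_entropy(sim[valid], target[valid])
        return loss / H

The cross-entropy pulls each pixel's true match above every other candidate on its epipolar line, which is one way to make the learnt similarities discriminative on radiometrically homogeneous areas.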
Abstract: Bundle adjustment (BA) is the standard way to optimise camera poses and to produce sparse representations of a scene. However, as the number of camera poses and features grows, refinement through bundle adjustment becomes inefficient. Inspired by global motion averaging methods, we propose a new bundle adjustment objective which does not rely on image features' reprojection errors yet maintains precision on par with classical BA. Our method averages over relative motions while implicitly incorporating the contribution of the structure in the adjustment. To that end, we weight the objective function by local Hessian matrices, a by-product of the local bundle adjustments performed on relative motions (e.g., pairs or triplets) during the pose initialisation step. Such Hessians are extremely rich, as they encapsulate both the features' random errors and the geometric configuration between the cameras. This information, propagated to the global frame, helps guide the final optimisation in a more rigorous way. We argue that this approach is an upgraded version of the motion averaging approach and demonstrate its effectiveness on both photogrammetric datasets and computer vision benchmarks.
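In equation form (our paraphrase, with notation introduced here): let $\hat{x}_{ij}$ be the relative motion estimated by a local bundle adjustment on the pair or triplet $(i,j)$, $x_{ij}(\{P\})$ the same relative motion predicted from the global poses, and $H_{ij}$ the corresponding local Hessian; the objective is then a Hessian-weighted sum of relative-motion residuals,

$$ \min_{\{P_k\}} \; \sum_{(i,j)} \big(x_{ij}(\{P\})-\hat{x}_{ij}\big)^{\top} H_{ij}\, \big(x_{ij}(\{P\})-\hat{x}_{ij}\big), $$

so that relative motions backed by strong geometry and low feature noise pull harder on the solution than weak ones.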
Abstract: The Corona KH-4 reconnaissance satellite missions from 1962-1972 acquired panoramic stereo imagery with a high spatial resolution of 1.8-7.5 m. The potential of the 800,000+ declassified Corona images has not been leveraged due to the complexities arising from the handling of the panoramic imaging geometry, film distortions, and the limited availability of the metadata required for georeferencing of the Corona imagery. This paper presents the Corona Stereo Pipeline (CoSP): a pipeline for processing Corona KH-4 stereo panoramic imagery. CoSP utilizes the deep-learning-based feature matcher SuperGlue to automatically match feature points between Corona KH-4 images and recent satellite imagery to generate Ground Control Points (GCPs). To model the imaging geometry and the scanning motion of the panoramic KH-4 cameras, a rigorous camera model consisting of modified collinearity equations with time-dependent exterior orientation parameters is employed. The results show that, using the entire frame of the Corona image, bundle adjustment with well-distributed GCPs results in an average standard deviation (SD) of less than 2 pixels. The distortion pattern of the image residuals at GCPs and the y-parallax in epipolar-resampled images suggest film distortions due to long-term storage as the likely cause of systematic deviations. Compared to the SRTM DEM, the Corona DEM computed using CoSP achieved a Normalized Median Absolute Deviation (NMAD) of elevation differences of ~4 m over an area of approximately 4000 km². We show that the proposed pipeline can be applied to sequences of complex scenes involving high-relief and glacierized terrain, and that the resulting DEMs can be used to compute long-term glacier elevation changes over large areas.
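For context, two of the quantities named above have standard forms. The collinearity equations (generic frame-camera statement; CoSP's panoramic model additionally maps the image coordinate through the scan geometry) become time-dependent by letting the exterior orientation vary with scan time $t$:

$$ x = -f\,\frac{r_{11}(t)\,\Delta X + r_{12}(t)\,\Delta Y + r_{13}(t)\,\Delta Z}{r_{31}(t)\,\Delta X + r_{32}(t)\,\Delta Y + r_{33}(t)\,\Delta Z}, \qquad y = -f\,\frac{r_{21}(t)\,\Delta X + r_{22}(t)\,\Delta Y + r_{23}(t)\,\Delta Z}{r_{31}(t)\,\Delta X + r_{32}(t)\,\Delta Y + r_{33}(t)\,\Delta Z}, $$

with $\Delta X = X - X_0(t)$ etc., and the rotation elements $r_{mn}(t)$ built from time-dependent attitude angles. The robust accuracy measure is the standard $\mathrm{NMAD} = 1.4826 \cdot \mathrm{median}_i\,\lvert \Delta h_i - \mathrm{median}_j(\Delta h_j)\rvert$, computed over the elevation differences $\Delta h$ to the reference DEM.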
Abstract: Historical imagery is characterized by high spatial resolution and stereoscopic acquisitions, providing a valuable resource for recovering 3D land-cover information. Accurate georeferencing of diachronic historical images by means of self-calibration remains a bottleneck because of the difficulty of finding a sufficient number of feature correspondences under evolving landscapes. In this research, we present a fully automatic approach to detecting feature correspondences between historical images taken at different times (i.e., inter-epoch), with no auxiliary data required. Based on relative orientations computed within the same epoch (i.e., intra-epoch), we obtain DSMs (Digital Surface Models) and incorporate them in a rough-to-precise matching scheme. The method consists of: (1) an inter-epoch DSM matching to roughly co-register the orientations and DSMs (i.e., via a 3D Helmert transformation), followed by (2) a precise inter-epoch feature matching on the original RGB images. The innate ambiguity of the latter is largely alleviated by narrowing down the search space using the co-registered data. With the inter-epoch features, we refine the image orientations and quantitatively evaluate the results (1) with DoDs (Differences of DSMs), (2) with ground check points, and (3) by quantifying the ground displacement due to an earthquake. We demonstrate that our method: (1) can automatically georeference diachronic historical images; (2) can effectively mitigate systematic errors induced by poorly estimated camera parameters; and (3) is robust to drastic scene changes. Compared to the state of the art, our method improves image georeferencing accuracy by a factor of 2. The proposed methods are implemented in MicMac, a free, open-source photogrammetric software.
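For concreteness, the 3D Helmert transformation used for the rough co-registration is the standard seven-parameter similarity

$$ \mathbf{X}' = \lambda\,\mathbf{R}(\omega,\varphi,\kappa)\,\mathbf{X} + \mathbf{T}, $$

with one scale $\lambda$, three rotation angles, and a 3D translation $\mathbf{T}$, here estimated by aligning the intra-epoch DSMs of the two epochs.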