Stepan Tulyakov

Beyond Calibration: Physically Informed Learning for Raw-to-Raw Mapping

Jun 11, 2025

Peter Grönquist, Stepan Tulyakov, Dengxin Dai

Abstract: Achieving consistent color reproduction across multiple cameras is essential for seamless image fusion and Image Processing Pipeline (ISP) compatibility in modern devices, but it is a challenging task due to variations in sensors and optics. Existing raw-to-raw conversion methods face limitations such as poor adaptability to changing illumination, high computational costs, or impractical requirements such as simultaneous camera operation and overlapping fields-of-view. We introduce the Neural Physical Model (NPM), a lightweight, physically-informed approach that simulates raw images under specified illumination to estimate transformations between devices. The NPM effectively adapts to varying illumination conditions, can be initialized with physical measurements, and supports training with or without paired data. Experiments on public datasets like NUS and BeyondRGB demonstrate that NPM outperforms recent state-of-the-art methods, providing robust chromatic consistency across different sensors and optical systems.

Event-based Image Deblurring with Dynamic Motion Awareness

Aug 24, 2022

Patricia Vitoria, Stamatios Georgoulis, Stepan Tulyakov, Alfredo Bochicchio, Julius Erbach, Yuanyou Li

Abstract: Non-uniform image deblurring is a challenging task due to the lack of temporal and textural information in the blurry image itself. Complementary information from auxiliary sensors such as event sensors is being explored to address these limitations. The latter can record changes in logarithmic intensity asynchronously, called events, with high temporal resolution and high dynamic range. Current event-based deblurring methods combine the blurry image with events to jointly estimate per-pixel motion and the deblur operator. In this paper, we argue that a divide-and-conquer approach is more suitable for this task. To this end, we propose to use modulated deformable convolutions, whose kernel offsets and modulation masks are dynamically estimated from events to encode the motion in the scene, while the deblur operator is learned from the combination of the blurry image and corresponding events. Furthermore, we employ a coarse-to-fine multi-scale reconstruction approach to cope with the inherent sparsity of events in low-contrast regions. Importantly, we introduce the first dataset containing pairs of real RGB blur images and related events during the exposure time. Our results show better overall robustness when using events, with improvements in PSNR by up to 1.57 dB on synthetic data and 1.08 dB on real event data.

* ECCVW 2022
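
The abstract above describes events as asynchronous recordings of changes in logarithmic intensity. A minimal sketch of the idealized event generation model often used to explain such sensors is given below. It is not taken from the paper: the function names, the contrast threshold of 0.2, and the small `eps` offset are assumptions chosen for illustration, and real sensors add noise, refractory periods, and per-pixel threshold variation.

```python
import math


def events_between(prev_intensity, next_intensity, threshold=0.2, eps=1e-6):
    """Signed number of events an idealized pixel fires between two intensities.

    Positive counts are ON events (brightening), negative counts are OFF
    events (darkening). `threshold` is a hypothetical contrast threshold.
    """
    # Events respond to changes in log intensity, not absolute intensity.
    delta = math.log(next_intensity + eps) - math.log(prev_intensity + eps)
    # One event per full threshold crossing; the remainder is discarded here.
    count = int(abs(delta) / threshold)
    return count if delta >= 0 else -count


def event_frame(prev_frame, next_frame, threshold=0.2):
    """Apply the per-pixel model to two frames given as nested lists."""
    return [
        [events_between(p, n, threshold) for p, n in zip(prev_row, next_row)]
        for prev_row, next_row in zip(prev_frame, next_frame)
    ]


if __name__ == "__main__":
    before = [[1.0, 1.0], [2.0, 1.0]]
    after = [[2.0, 1.05], [1.0, 1.0]]
    print(event_frame(before, after))
```

Under this model a low-contrast region produces few or no events, which is the sparsity that the coarse-to-fine reconstruction in the abstract is meant to cope with.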

Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion

Mar 31, 2022

Stepan Tulyakov, Alfredo Bochicchio, Daniel Gehrig, Stamatios Georgoulis, Yuanyou Li, Davide Scaramuzza

Abstract: Recently, video frame interpolation using a combination of frame- and event-based cameras has surpassed traditional image-based methods both in terms of performance and memory efficiency. However, current methods still suffer from (i) brittle image-level fusion of complementary interpolation results, which fails in the presence of artifacts in the fused image, (ii) potentially temporally inconsistent and inefficient motion estimation procedures, which run for every inserted frame, and (iii) low-contrast regions that do not trigger events, and thus cause events-only motion estimation to generate artifacts. Moreover, previous methods were only tested on datasets consisting of planar and faraway scenes, which do not capture the full complexity of the real world. In this work, we address the above problems by introducing multi-scale feature-level fusion and computing one-shot non-linear inter-frame motion from events and images, which can be efficiently sampled for image warping. We also collect the first large-scale events and frames dataset consisting of more than 100 challenging scenes with depth variations, captured with a new experimental setup based on a beamsplitter. We show that our method improves the reconstruction quality by up to 0.2 dB in terms of PSNR and up to 15% in LPIPS score.

* IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 2022

Multi-Bracket High Dynamic Range Imaging with Event Cameras

Mar 13, 2022

Nico Messikommer, Stamatios Georgoulis, Daniel Gehrig, Stepan Tulyakov, Julius Erbach, Alfredo Bochicchio, Yuanyou Li, Davide Scaramuzza

Abstract: Modern high dynamic range (HDR) imaging pipelines align and fuse multiple low dynamic range (LDR) images captured at different exposure times. While these methods work well in static scenes, dynamic scenes remain a challenge since the LDR images still suffer from saturation and noise. In such scenarios, event cameras would be a valid complement, thanks to their higher temporal resolution and dynamic range. In this paper, we propose the first multi-bracket HDR pipeline combining a standard camera with an event camera. Our results show better overall robustness when using events, with improvements in PSNR by up to 5 dB on synthetic data and up to 0.7 dB on real-world data. We also introduce a new dataset containing bracketed LDR images with aligned events and HDR ground truth.

TimeLens: Event-based Video Frame Interpolation

Jun 14, 2021

Stepan Tulyakov, Daniel Gehrig, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza

Abstract: State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. In the absence of additional information, first-order approximations, i.e. optical flow, must be used, but this choice restricts the types of motions that can be modeled, leading to errors in highly dynamic scenarios. Event cameras are novel sensors that address this limitation by providing auxiliary visual information in the blind-time between frames. They asynchronously measure per-pixel brightness changes and do this with high temporal resolution and low latency. Event-based frame interpolation methods typically adopt a synthesis-based approach, where predicted frame residuals are directly applied to the key-frames. However, while these approaches can capture non-linear motions, they suffer from ghosting and perform poorly in low-texture regions with few events. Thus, synthesis-based and flow-based approaches are complementary. In this work, we introduce Time Lens, a novel method that leverages the advantages of both. We extensively evaluate our method on three synthetic and two real benchmarks, where we show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods. Finally, we release a new large-scale dataset in highly dynamic scenarios, aimed at pushing the limits of existing methods.

* IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
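
Several abstracts on this page report gains in PSNR, measured in dB. As a reading aid, the sketch below uses the standard definition, PSNR = 10 log10(MAX^2 / MSE), to show what such a gain means in terms of mean squared error. The function names and the assumption that pixel values lie in [0, 1] are illustrative and not taken from any of the papers.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for two equally long pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    clean = [0.0, 0.0, 0.0, 0.0]
    noisy = [0.1, 0.1, 0.1, 0.1]
    print(round(psnr(clean, noisy), 2))
    print(round(mse_ratio_for_gain(5.21), 2))
```

Because the scale is logarithmic, a gain of 5.21 dB corresponds to the mean squared error shrinking by a factor of roughly 3.3, while a gain of 0.2 dB corresponds to a reduction of only about 5%.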

Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching

Jun 05, 2018

Stepan Tulyakov, Anton Ivanov, Francois Fleuret

Abstract: End-to-end deep-learning networks recently demonstrated extremely good performance for stereo matching. However, existing networks are difficult to use for practical applications since (1) they are memory-hungry and unable to process even modest-size images, and (2) they have to be trained for a given disparity range. The Practical Deep Stereo (PDS) network that we propose addresses both issues: First, its architecture relies on novel bottleneck modules that drastically reduce the memory footprint in inference, and additional design choices allow it to handle larger images during training. This results in a model that leverages large image context to resolve matching ambiguities. Second, a novel sub-pixel cross-entropy loss combined with a MAP estimator makes this network less sensitive to ambiguous matches, and applicable to any disparity range without re-training. We compare PDS to state-of-the-art methods published over the recent months, and demonstrate its superior performance on the FlyingThings3D and KITTI sets.
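
The abstract credits a MAP estimator with making the network less sensitive to ambiguous matches. One way to picture this, offered here as an assumption rather than as the paper's implementation, is to take the most probable disparity and refine it with a probability-weighted average over a small window around that mode, instead of averaging over the whole distribution. The function names and the window `radius` are hypothetical.

```python
def full_mean(probabilities):
    """Expected disparity over the whole distribution."""
    total = sum(probabilities)
    return sum(d * p for d, p in enumerate(probabilities)) / total


def subpixel_map(probabilities, radius=1):
    """Mode of the distribution, refined by a weighted mean around the mode.

    `radius` is an illustrative window half-width, not a value from the paper.
    """
    mode = max(range(len(probabilities)), key=probabilities.__getitem__)
    lo = max(0, mode - radius)
    hi = min(len(probabilities), mode + radius + 1)
    mass = sum(probabilities[lo:hi])
    if mass <= 0:
        raise ValueError("distribution has no probability mass")
    return sum(d * probabilities[d] for d in range(lo, hi)) / mass


if __name__ == "__main__":
    # A bimodal distribution, as produced by an ambiguous match.
    probs = [0.05, 0.4, 0.05, 0.0, 0.0, 0.1, 0.3, 0.1]
    print(full_mean(probs))     # lands between the two modes
    print(subpixel_map(probs))  # stays near the dominant mode
```

With two competing peaks, the full mean falls between them at a disparity that neither peak supports, whereas the windowed estimate stays at the dominant peak. The estimate also does not depend on how many disparity candidates the distribution covers, which is consistent with the abstract's claim of applicability to any disparity range.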

Geometric calibration of Colour and Stereo Surface Imaging System of ESA's Trace Gas Orbiter

Jul 03, 2017

Stepan Tulyakov, Anton Ivanov, Nicolas Thomas, Victoria Roloff, Antoine Pommerol, Gabriele Cremonese, Thomas Weigel, Francois Fleuret

Abstract: There are many geometric calibration methods for "standard" cameras. These methods, however, cannot be used for the calibration of telescopes with large focal lengths and complex off-axis optics. Moreover, specialized calibration methods for telescopes are scarce in the literature. We describe the calibration method that we developed for the Colour and Stereo Surface Imaging System (CaSSIS) telescope, on board the ExoMars Trace Gas Orbiter (TGO). Although our method is described in the context of CaSSIS, with camera-specific experiments, it is general and can be applied to other telescopes. We further encourage re-use of the proposed method by making our calibration code and data available on-line.

* Submitted to Advances in Space Research

Semi-supervised learning of deep metrics for stereo reconstruction

Dec 03, 2016

Stepan Tulyakov, Anton Ivanov, Francois Fleuret

Abstract: Deep-learning metrics have recently demonstrated extremely good performance in matching image patches for stereo reconstruction. However, training such metrics requires a large amount of labeled stereo images, which can be difficult or costly to collect for certain applications. The main contribution of our work is a new semi-supervised method for learning deep metrics from unlabeled stereo images, given coarse information about the scenes and the optical system. Our method alternately optimizes the metric with standard stochastic gradient descent and applies stereo constraints to regularize its prediction. Experiments on reference data-sets show that, for a given network architecture, training with this new method without ground truth produces a metric with performance as good as state-of-the-art baselines trained with the said ground truth. This work has three practical implications. Firstly, it helps to overcome limitations of training sets, in particular noisy ground truth. Secondly, it allows the use of much more training data during learning. Thirdly, it allows tuning a deep metric for a particular stereo system, even if ground truth is not available.

* 11 pages, 3 figures
