Wim Abbeloos

UniK3D: Universal Camera Monocular 3D Estimation

Mar 20, 2025

Luigi Piccinelli, Christos Sakaridis, Mattia Segu, Yung-Hsu Yang, Siyuan Li, Wim Abbeloos, Luc Van Gool

Abstract: Monocular 3D estimation is crucial for visual perception. However, current methods fall short by relying on oversimplified assumptions, such as pinhole camera models or rectified images. These limitations severely restrict their general applicability, causing poor performance in real-world scenarios with fisheye or panoramic images and resulting in substantial context loss. To address this, we present UniK3D, the first generalizable method for monocular 3D estimation able to model any camera. Our method introduces a spherical 3D representation which allows for better disentanglement of camera and scene geometry and enables accurate metric 3D reconstruction for unconstrained camera models. Our camera component features a novel, model-independent representation of the pencil of rays, achieved through a learned superposition of spherical harmonics. We also introduce an angular loss, which, together with the camera module design, prevents the contraction of the 3D outputs for wide-view cameras. A comprehensive zero-shot evaluation on 13 diverse datasets demonstrates the state-of-the-art performance of UniK3D across 3D, depth, and camera metrics, with substantial gains in challenging large-field-of-view and panoramic settings, while maintaining top accuracy in conventional pinhole small-field-of-view domains. Code and models are available at github.com/lpiccinelli-eth/unik3d.
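
To make the idea of a spherical 3D representation concrete, the following is a minimal sketch, not the UniK3D implementation. It assumes each pixel carries a ray direction given by two angles plus a radial distance along that ray; the function name `spherical_to_cartesian` and the angle convention (polar angle measured from the optical axis) are illustrative assumptions, not taken from the paper.

```python
import math


def spherical_to_cartesian(azimuth, polar, radius):
    """Convert a ray direction and radial distance to a camera-frame point.

    Assumed convention (hypothetical, not from the paper): +z is the
    optical axis, `polar` is measured from +z, and `azimuth` is measured
    from +x in the plane perpendicular to the optical axis.
    """
    sin_polar = math.sin(polar)
    x = radius * sin_polar * math.cos(azimuth)
    y = radius * sin_polar * math.sin(azimuth)
    z = radius * math.cos(polar)
    return (x, y, z)


# A ray 100 degrees off the optical axis, as a fisheye lens might capture.
# Its z coordinate is negative, which a pinhole depth map cannot express,
# while the (angles, radius) form handles it without special cases.
wide_point = spherical_to_cartesian(0.0, math.radians(100.0), 2.0)

# A ray along the optical axis reduces to the familiar (0, 0, depth).
axis_point = spherical_to_cartesian(0.0, 0.0, 2.0)
```

Because the angles depend only on the camera and the radius only on the scene, the two factors can be predicted separately, which is one way to read the disentanglement the abstract describes.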

UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler

Feb 27, 2025

Luigi Piccinelli, Christos Sakaridis, Yung-Hsu Yang, Mattia Segu, Siyuan Li, Wim Abbeloos, Luc Van Gool

Abstract: Accurate monocular metric depth estimation (MMDE) is crucial to solving downstream tasks in 3D perception and modeling. However, the remarkable accuracy of recent MMDE methods is confined to their training domains. These methods fail to generalize to unseen domains even in the presence of moderate domain gaps, which hinders their practical applicability. We propose a new model, UniDepthV2, capable of reconstructing metric 3D scenes solely from single images across domains. Departing from the existing MMDE paradigm, UniDepthV2 directly predicts metric 3D points from the input image at inference time without any additional information, striving for a universal and flexible MMDE solution. In particular, UniDepthV2 implements a self-promptable camera module predicting a dense camera representation to condition depth features. Our model exploits a pseudo-spherical output representation, which disentangles the camera and depth representations. In addition, we propose a geometric invariance loss that promotes the invariance of camera-prompted depth features. UniDepthV2 improves on its predecessor, UniDepth, via a new edge-guided loss which enhances the localization and sharpness of edges in the metric depth outputs, a revisited, simplified and more efficient architectural design, and an additional uncertainty-level output which enables downstream tasks requiring confidence. Thorough evaluations on ten depth datasets in a zero-shot regime consistently demonstrate the superior performance and generalization of UniDepthV2. Code and models are available at https://github.com/lpiccinelli-eth/UniDepth.

* arXiv admin note: substantial text overlap with arXiv:2403.18913

MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation

Oct 01, 2021

Jonas Heylen, Mark De Wolf, Bruno Dawagne, Marc Proesmans, Luc Van Gool, Wim Abbeloos, Hazem Abdelkawy, Daniel Olmeda Reino

Abstract: Monocular 3D object detection has recently shown promising results; however, challenging problems remain. One of these is the lack of invariance to different camera intrinsic parameters, which can be observed across different 3D object datasets. Little effort has been made to exploit the combination of heterogeneous 3D object datasets. In contrast to general intuition, we show that more data does not automatically guarantee better performance; rather, methods need to have a degree of 'camera independence' in order to benefit from large and heterogeneous training data. In this paper we propose a category-level pose estimation method based on instance segmentation, using camera-independent geometric reasoning to cope with the varying camera viewpoints and intrinsics of different datasets. Every pixel of an instance predicts the object dimensions, the 3D object reference points projected in 2D image space and, optionally, the local viewing angle. Camera intrinsics are only used outside of the learned network to lift the predicted 2D reference points to 3D. We surpass camera-independent methods on the challenging KITTI3D benchmark and show the key benefits compared to camera-dependent methods.

* Accepted to ICCV2021 Workshop on 3D Object Detection from Images
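
The lifting step mentioned in the abstract can be illustrated with standard pinhole back-projection. This is a minimal sketch under stated assumptions, not the MonoCInIS pipeline: it assumes a depth value for the reference point is already available (the abstract does not say how depth is obtained), and the function name `lift_to_3d` and the intrinsics values are hypothetical.

```python
def lift_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project a 2D image point to a 3D camera-frame point.

    Uses the standard pinhole model. `depth` is assumed to be known here;
    how it is recovered is outside the scope of this sketch.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics and a predicted 2D reference point 100 px to the
# right of the principal point, at an assumed depth of 10 m.
reference_point = lift_to_3d(740.0, 360.0, 10.0,
                             fx=700.0, fy=700.0, cx=640.0, cy=360.0)
```

Keeping the intrinsics in this post-processing step means the network itself only predicts image-space quantities, so the same weights can in principle be applied to images from cameras with different focal lengths.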

3D Object Discovery and Modeling Using Single RGB-D Images Containing Multiple Object Instances

Oct 17, 2017

Wim Abbeloos, Esra Ataer-Cansizoglu, Sergio Caccamo, Yuichi Taguchi, Yukiyasu Domae

Abstract: Unsupervised object modeling is important in robotics, especially for handling a large set of objects. We present a method for unsupervised 3D object discovery, reconstruction, and localization that exploits multiple instances of an identical object contained in a single RGB-D image. The proposed method does not rely on segmentation, scene knowledge, or user input, and thus is easily scalable. Our method aims to find recurrent patterns in a single RGB-D image by utilizing the appearance and geometry of the salient regions. We extract keypoints and match them in pairs based on their descriptors. We then generate triplets of mutually matching keypoints using several geometric criteria to minimize false matches. The relative poses of the matched triplets are computed and clustered to discover sets of triplet pairs with similar relative poses. Triplets belonging to the same set are likely to belong to the same object and are used to construct an initial object model. Detecting the remaining instances with the initial object model using RANSAC allows the model to be further expanded and refined. The automatically generated object models are both compact and descriptive. We show quantitative and qualitative results on RGB-D images with various objects including some from the Amazon Picking Challenge. We also demonstrate the use of our method in an object picking scenario with a robotic arm.

* Proceedings International Conference on 3D Vision 2017 (pp. 431-439)

Detecting and Grouping Identical Objects for Region Proposal and Classification

Jul 23, 2017

Wim Abbeloos, Sergio Caccamo, Esra Ataer-Cansizoglu, Yuichi Taguchi, Chen Feng, Teng-Yok Lee

Abstract: Often multiple instances of an object occur in the same scene, for example in a warehouse. Unsupervised multi-instance object discovery algorithms are able to detect and identify such objects. We use such an algorithm to provide object proposals to a convolutional neural network (CNN) based classifier. This results in fewer regions to evaluate, compared to traditional region proposal algorithms. Additionally, it enables using the joint probability of multiple instances of an object, resulting in improved classification accuracy. The proposed technique can also split a single class into multiple sub-classes corresponding to the different object types, enabling hierarchical classification.

* IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Workshop Deep Learning for Robotic Vision, 21 July, 2017, Honolulu, Hawaii
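
One way to picture how a joint probability over several instances can sharpen a classification is the sketch below. It is an illustration only: it assumes the per-instance classifier outputs are conditionally independent and combines them by a normalized product, which may differ from the rule used in the paper. The function name and the example scores are hypothetical.

```python
import math


def joint_class_probabilities(per_instance_probs):
    """Fuse class distributions of instances known to show the same object.

    per_instance_probs: list of per-instance class distributions, each a
    list of probabilities in the same class order.
    Assumption (not from the paper): instances are treated as independent,
    so the joint score of a class is the product of its per-instance scores.
    """
    n_classes = len(per_instance_probs[0])
    log_scores = [0.0] * n_classes
    for probs in per_instance_probs:
        for c, p in enumerate(probs):
            # Clamp to avoid log(0) for classes the classifier rules out.
            log_scores[c] += math.log(max(p, 1e-12))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores)
    unnormalized = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Three grouped instances, each only weakly favoring class 0.
fused = joint_class_probabilities([[0.60, 0.40], [0.55, 0.45], [0.70, 0.30]])
```

Under this assumption, several individually uncertain predictions that agree yield a fused distribution more confident than any single one, which matches the accuracy gain the abstract attributes to grouping.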

Team Applied Robotics: A closer look at our robotic picking system

Jul 23, 2017

Wim Abbeloos, Fabian Gouwens, Simon Jansen, Berend Küpers, Maurice Ramaker, Toon Goedemé

Abstract: This paper describes the vision-based robotic picking system that was developed by our team, Team Applied Robotics, for the Amazon Picking Challenge 2016. This competition challenged teams to develop a robotic system that is able to pick a large variety of products from a shelf or a tote. We discuss the design considerations and our strategy, the high-resolution 3D vision system, the use of a combination of texture and shape-based object detection algorithms, and the robot path planning and object manipulators that were developed.

* IEEE International Conference on Robotics and Automation (ICRA), Warehouse Picking Automation Workshop, May 29 to June 3, 2017, Singapore

Exploring the potential of combining time of flight and thermal infrared cameras for person detection

Dec 07, 2016

Wim Abbeloos, Toon Goedemé

Abstract: Combining new, low-cost thermal infrared and time-of-flight range sensors provides new opportunities. In this position paper we explore the possibilities of combining these sensors and using their fused data for person detection. The proposed calibration approach for this sensor combination differs from the traditional stereo camera calibration in two fundamental ways. A first distinction is that the spectral sensitivity of the two sensors differs significantly. In fact, there is no sensitivity range overlap at all. A second distinction is that their resolution is typically very low, which requires special attention. We assume a situation in which the sensors' relative position is known, but their orientation is unknown. In addition, some of the typical measurement errors are discussed, and methods to compensate for them are proposed. We discuss how the fused data could allow increased accuracy and robustness without the need for complex algorithms requiring large amounts of computational power and training data.

* Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics (2013), pp. 464-470

Process Monitoring of Extrusion Based 3D Printing via Laser Scanning

Dec 07, 2016

Matthias Faes, Wim Abbeloos, Frederik Vogeler, Hans Valkenaers, Kurt Coppens, Toon Goedemé, Eleonora Ferraris

Abstract: Extrusion-based 3D Printing (E3DP) is an Additive Manufacturing (AM) technique that extrudes thermoplastic polymer in order to build up components using a layerwise approach. AM typically requires long production times in comparison to mass production processes such as Injection Molding. Failures during the AM process are often only noticed after build completion and frequently lead to part rejection because of dimensional inaccuracy or lack of mechanical performance, resulting in a significant loss of time and material. A solution to improve the accuracy and robustness of a manufacturing technology is the integration of sensors to monitor and control process state-variables online. In this way, errors can be rapidly detected and possibly compensated at an early stage. To achieve this, we integrated a modular 2D laser triangulation scanner into an E3DP machine and analyzed feedback signals. A 2D laser triangulation scanner was selected here owing to its very compact size, achievable accuracy and the possibility of capturing geometrical 3D data. Thus, our implemented system is able to provide both quantitative and qualitative information. Also, in this work, first steps towards the development of a quality control loop for E3DP processes are presented and opportunities are discussed.

* Conference Proceedings PMI 6 (2014) 363-367
* International Conference on Polymers and Moulds Innovations (PMI) 2014

Embedded Line Scan Image Sensors: The Low Cost Alternative for High Speed Imaging

Dec 07, 2016

Stef Van Wolputte, Wim Abbeloos, Stijn Helsen, Abdellatif Bey-Temsamani, Toon Goedemé

Abstract: In this paper we propose a low-cost, high-speed line scan imaging system. We replace an expensive industrial line scan camera and illumination with a custom-built set-up of cheap off-the-shelf components, yielding a measurement system with comparable quality while costing about 20 times less. We use a low-cost linear (1D) image sensor, cheap optics including LED-based or laser-based lighting, and an embedded platform to process the images. A step-by-step method to design such a custom high-speed imaging system and select proper components is proposed. Simulations that predict the final image quality obtained by the set-up have been developed. Finally, we applied our method in a lab setting that closely represents real-life cases. Our results show that our simulations are very accurate and that our low-cost line scan set-up achieves image quality comparable to that of a high-end commercial vision system, for a fraction of the price.

* Proceedings of the 2015 International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 543-549

Fusion of Range and Thermal Images for Person Detection

Dec 07, 2016

Wim Abbeloos, Toon Goedemé

Abstract: Detecting people in images is a challenging problem. Differences in pose, clothing and lighting, along with other factors, cause a lot of variation in their appearance. To overcome these issues, we propose a system based on fused range and thermal infrared images. These measurements show considerably less variation and provide more meaningful information. We provide a brief introduction to the sensor technology used and propose a calibration method. Several data fusion algorithms are compared and their performance is assessed on a simulated data set. The results of initial experiments on real data are analyzed and the measurement errors and the challenges they present are discussed. The resulting fused data are used to efficiently detect people in a fixed camera set-up. The system is extended to include person tracking.

* Proceedings Conferencia Internacional de Ingeniería Eléctrica 7 (2014) 1-4
* VII International Conference on Electrical Engineering FIE 2014, Santiago de Cuba
