Abstract:Object pose estimation is a prominent task in computer vision. The object pose gives the orientation and translation of the object in real-world space, which allows various applications such as manipulation, augmented reality, etc. Various objects exhibit different properties with light, such as reflections, absorption, etc. This makes it challenging to understand the object's structure in RGB and depth channels. Recent research has been moving toward learning-based methods, which provide a more flexible and generalizable approach to object pose estimation utilizing deep learning. One such approach is the render-and-compare method, which renders the object from multiple views and compares it against the given 2D image, which often requires an object representation in the form of a CAD model. We reason that the synthetic texture of the CAD model may not be ideal for rendering and comparing operations. We showed that if the object is represented as an implicit (neural) representation in the form of Neural Radiance Field (NeRF), it exhibits a more realistic rendering of the actual scene and retains the crucial spatial features, which makes the comparison more versatile. We evaluated our NeRF implementation of the render-and-compare method on transparent datasets and found that it surpassed the current state-of-the-art results.
Abstract:Object pose estimation is essential to many industrial applications involving robotic manipulation, navigation, and augmented reality. Current generalizable object pose estimators, i.e., approaches that do not need to be trained per object, rely on accurate 3D models. Predominantly, CAD models are used, which can be hard to obtain in practice. At the same time, it is often possible to acquire images of an object. Naturally, this leads to the question whether 3D models reconstructed from images are sufficient to facilitate accurate object pose estimation. We aim to answer this question by proposing a novel benchmark for measuring the impact of 3D reconstruction quality on pose estimation accuracy. Our benchmark provides calibrated images for object reconstruction registered with the test images of the YCB-V dataset for pose evaluation under the BOP benchmark format. Detailed experiments with multiple state-of-the-art 3D reconstruction and object pose estimation approaches show that the geometry produced by modern reconstruction methods is often sufficient for accurate pose estimation. Our experiments lead to interesting observations: (1) Standard metrics for measuring 3D reconstruction quality are not necessarily indicative of pose estimation accuracy, which shows the need for dedicated benchmarks such as ours. (2) Classical, non-learning-based approaches can perform on par with modern learning-based reconstruction techniques and can even offer a better reconstruction time-pose accuracy tradeoff. (3) There is still a sizable gap between performance with reconstructed and with CAD models. To foster research on closing this gap, our benchmark is publicly available at https://github.com/VarunBurde/reconstruction_pose_benchmark}.
Abstract:This study focuses on the energy optimization of industrial robotic cells, which is essential for sustainable production in the long term. A holistic approach that considers a robotic cell as a whole toward minimizing energy consumption is proposed. The mathematical model, which takes into account various robot speeds, positions, power-saving modes, and alternative orders of operations, can be transformed into a mixed-integer linear programming formulation that is, however, suitable only for small instances. To optimize complex robotic cells, a hybrid heuristic accelerated by using multicore processors and the Gurobi simplex method for piecewise linear convex functions is implemented. The experimental results showed that the heuristic solved 93 % of instances with a solution quality close to a proven lower bound. Moreover, compared with the existing works, which typically address problems with three to four robots, this study solved real-size problem instances with up to 12 robots and considered more optimization aspects. The proposed algorithms were also applied on an existing robotic cell in \v{S}koda Auto. The outcomes, based on simulations and measurements, indicate that, compared with the previous state (at maximal robot speeds and without deeper power-saving modes), the energy consumption can be reduced by about 20 % merely by optimizing the robot speeds and applying power-saving modes. All the software and generated datasets used in this research are publicly available.