Frank Michel

Demonstrating Mobile Manipulation in the Wild: A Metrics-Driven Approach

Jan 03, 2024

Max Bajracharya, James Borders, Richard Cheng, Dan Helmick, Lukas Kaul, Dan Kruse, John Leichty, Jeremy Ma, Carolyn Matl, Frank Michel(+4 more)

Abstract: We present our general-purpose mobile manipulation system consisting of a custom robot platform and key algorithms spanning perception and planning. To extensively test the system in the wild and benchmark its performance, we choose a grocery shopping scenario in an actual, unmodified grocery store. We derive key performance metrics from detailed robot log data collected during six week-long field tests, spread across 18 months. These objective metrics, gained from complex yet repeatable tests, drive the direction of our research efforts and let us continuously improve our system's performance. We find that thorough end-to-end system-level testing of a complex mobile manipulation system can serve as a reality check for state-of-the-art methods in robotics. This effectively grounds robotics research efforts in real-world needs and challenges, which we deem highly useful for the advancement of the field. To this end, we share our key insights and takeaways to inspire and accelerate similar system-level research projects.

* Presented at RSS 2023 [Best Demo Paper Award]

BOP Challenge 2020 on 6D Object Localization

Oct 13, 2020

Tomas Hodan, Martin Sundermeyer, Bertram Drost, Yann Labbe, Eric Brachmann, Frank Michel, Carsten Rother, Jiri Matas

Abstract: This paper presents the evaluation methodology, datasets, and results of the BOP Challenge 2020, the third in a series of public competitions organized with the goal of capturing the status quo in the field of 6D object pose estimation from an RGB-D image. In 2020, to reduce the domain gap between synthetic training and real test RGB images, the participants were provided 350K photorealistic training images generated by BlenderProc4BOP, a new open-source and light-weight physically-based renderer (PBR) and procedural data generator. Methods based on deep neural networks have finally caught up with methods based on point pair features, which were dominating previous editions of the challenge. Although the top-performing methods rely on RGB-D image channels, strong results were achieved when only RGB channels were used at both training and test time: out of the 26 evaluated methods, the third method was trained on RGB channels of PBR and real images, while the fifth was trained on RGB channels of PBR images only. Strong data augmentation was identified as a key component of the top-performing CosyPose method, and the photorealism of PBR images was shown to be effective despite the augmentation. The online evaluation system stays open and is available on the project website: bop.felk.cvut.cz.

* In ECCV 2020 Workshops Proceedings

A Summary of the 4th International Workshop on Recovering 6D Object Pose

Oct 09, 2018

Tomas Hodan, Rigas Kouskouridas, Tae-Kyun Kim, Federico Tombari, Kostas Bekris, Bertram Drost, Thibault Groueix, Krzysztof Walas, Vincent Lepetit, Ales Leonardis(+5 more)

Abstract: This document summarizes the 4th International Workshop on Recovering 6D Object Pose, which was organized in conjunction with ECCV 2018 in Munich. The workshop featured four invited talks, oral and poster presentations of accepted workshop papers, and an introduction of the BOP benchmark for 6D object pose estimation. The workshop was attended by 100+ people working on relevant topics in both academia and industry who shared up-to-date advances and discussed open problems.

* In: Computer Vision - ECCV 2018 Workshops - Munich, Germany, September 8-9 and 14, 2018, Proceedings

BOP: Benchmark for 6D Object Pose Estimation

Aug 24, 2018

Tomas Hodan, Frank Michel, Eric Brachmann, Wadim Kehl, Anders Glent Buch, Dirk Kraft, Bertram Drost, Joel Vidal, Stephan Ihrke, Xenophon Zabulis(+6 more)

Abstract: We propose a benchmark for 6D pose estimation of a rigid object from a single RGB-D input image. The training data consists of a texture-mapped 3D object model or images of the object in known 6D poses. The benchmark comprises: i) eight datasets in a unified format that cover different practical scenarios, including two new datasets focusing on varying lighting conditions, ii) an evaluation methodology with a pose-error function that deals with pose ambiguities, iii) a comprehensive evaluation of 15 diverse recent methods that captures the status quo of the field, and iv) an online evaluation system that is open for continuous submission of new results. The evaluation shows that methods based on point-pair features currently perform best, outperforming template matching methods, learning-based methods and methods based on 3D local features. The project website is available at bop.felk.cvut.cz.

* ECCV 2018

DSAC - Differentiable RANSAC for Camera Localization

Mar 21, 2018

Eric Brachmann, Alexander Krull, Sebastian Nowozin, Jamie Shotton, Frank Michel, Stefan Gumhold, Carsten Rother

Abstract: RANSAC is an important algorithm in robust optimization and a central building block for many computer vision applications. In recent years, traditionally hand-crafted pipelines have been replaced by deep learning pipelines, which can be trained in an end-to-end fashion. However, RANSAC has so far not been used as part of such deep learning pipelines, because its hypothesis selection procedure is non-differentiable. In this work, we present two different ways to overcome this limitation. The most promising approach is inspired by reinforcement learning, namely to replace the deterministic hypothesis selection by a probabilistic selection for which we can derive the expected loss w.r.t. all learnable parameters. We call this approach DSAC, the differentiable counterpart of RANSAC. We apply DSAC to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches. We demonstrate that by directly minimizing the expected loss of the output camera poses, robustly estimated by RANSAC, we achieve an increase in accuracy. In the future, any deep learning pipeline can use DSAC as a robust optimization component.

* CVPR 2017
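
The contrast the DSAC abstract draws between deterministic and probabilistic hypothesis selection can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax form of the selection distribution, the `alpha` sharpness parameter, and all function names and toy numbers below are assumptions made for illustration only. It shows why a hard argmax gives no usable gradient, while an expectation over a selection distribution is a smooth function of the hypothesis scores.

```python
import math
import random


def softmax(scores, alpha=1.0):
    """Map hypothesis scores to selection probabilities (numerically stable).

    The softmax form and the `alpha` sharpness factor are assumptions of
    this sketch, not details taken from the abstract.
    """
    m = max(scores)
    exps = [math.exp(alpha * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hard_select(scores):
    """RANSAC-style deterministic selection: index of the best-scoring hypothesis."""
    return max(range(len(scores)), key=lambda i: scores[i])


def sample_hypothesis(scores, alpha=1.0, rng=random):
    """Probabilistic selection: draw one hypothesis index from the softmax distribution."""
    probs = softmax(scores, alpha)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


def expected_loss(scores, losses, alpha=1.0):
    """Expected task loss under probabilistic selection.

    This is a smooth function of the scores, so it can be differentiated
    with respect to whatever parameters produce them.
    """
    probs = softmax(scores, alpha)
    return sum(p * l for p, l in zip(probs, losses))


# Toy values (hypothetical): three pose hypotheses with scores and task losses.
scores = [2.0, 0.5, 1.0]
losses = [0.1, 0.9, 0.4]

print("hard selection picks index:", hard_select(scores))
print("sampled index:", sample_hypothesis(scores))
print("expected loss:", expected_loss(scores, losses))
```

In a real pipeline the scores would come from a learned scoring function and the expectation would be minimized with an automatic differentiation framework; the plain-Python version above only shows the structure of the objective.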

PoseAgent: Budget-Constrained 6D Object Pose Estimation via Reinforcement Learning

Apr 11, 2017

Alexander Krull, Eric Brachmann, Sebastian Nowozin, Frank Michel, Jamie Shotton, Carsten Rother

Abstract: State-of-the-art computer vision algorithms often achieve efficiency by making discrete choices about which hypotheses to explore next. This allows allocation of computational resources to promising candidates; however, such decisions are non-differentiable. As a result, these algorithms are hard to train in an end-to-end fashion. In this work we propose to learn an efficient algorithm for the task of 6D object pose estimation. Our system optimizes the parameters of an existing state-of-the-art pose estimation system using reinforcement learning, where the pose estimation system now becomes the stochastic policy, parametrized by a CNN. Additionally, we present an efficient training algorithm that dramatically reduces computation time. We show empirically that our learned pose estimation procedure makes better use of limited resources and improves upon the state-of-the-art on a challenging dataset. Our approach enables differentiable end-to-end training of complex algorithmic pipelines and learns to make optimal use of a given computational budget.

Global Hypothesis Generation for 6D Object Pose Estimation

Jan 02, 2017

Frank Michel, Alexander Kirillov, Eric Brachmann, Alexander Krull, Stefan Gumhold, Bogdan Savchynskyy, Carsten Rother

Abstract: This paper addresses the task of estimating the 6D pose of a known 3D object from a single RGB-D image. Most modern approaches solve this task in three steps: i) Compute local features; ii) Generate a pool of pose-hypotheses; iii) Select and refine a pose from the pool. This work focuses on the second step. While all existing approaches generate the hypotheses pool via local reasoning, e.g. RANSAC or Hough-voting, we are the first to show that global reasoning is beneficial at this stage. In particular, we formulate a novel fully-connected Conditional Random Field (CRF) that outputs a very small number of pose-hypotheses. Despite the potential functions of the CRF being non-Gaussian, we give a new and efficient two-step optimization procedure, with some guarantees for optimality. We utilize our global hypotheses generation procedure to produce results that exceed the state of the art for the challenging "Occluded Object Dataset".

Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images

Aug 19, 2015

Alexander Krull, Eric Brachmann, Frank Michel, Michael Ying Yang, Stefan Gumhold, Carsten Rother

Abstract: Analysis-by-synthesis has been a successful approach for many tasks in computer vision, such as 6D pose estimation of an object in an RGB-D image, which is the topic of this work. The idea is to compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose. Due to occlusion or complicated sensor noise, it can be difficult to perform this comparison in a meaningful way. We propose an approach that "learns to compare", while taking these difficulties into account. This is done by describing the posterior density of a particular object pose with a convolutional neural network (CNN) that compares an observed and rendered image. The network is trained with the maximum likelihood paradigm. We observe empirically that the CNN does not specialize to the geometry or appearance of specific objects, and it can be used with objects of vastly different shapes and appearances, and in different backgrounds. Compared to the state of the art, we demonstrate a significant improvement on two different datasets which include a total of eleven objects, cluttered background, and heavy occlusion.

* 16 pages, 8 figures
