Title: Translating a Visual LEGO Manual to a Machine-Executable Plan

Jul 25, 2022

Ruocheng Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu

Abstract: We study the problem of translating an image-based, step-by-step assembly manual created by human designers into machine-interpretable instructions. We formulate this problem as a sequential prediction task: at each step, our model reads the manual, locates the components to be added to the current shape, and infers their 3D poses. This task poses two challenges: establishing a 2D-3D correspondence between the manual image and the real 3D object, and estimating 3D poses for unseen 3D objects, since a new component added in a step can itself be an object built in previous steps. To address these two challenges, we present a novel learning-based framework, the Manual-to-Executable-Plan Network (MEPNet), which reconstructs the assembly steps from a sequence of manual images. The key idea is to integrate neural 2D keypoint detection modules and 2D-3D projection algorithms for high-precision prediction and strong generalization to unseen components. MEPNet outperforms existing methods on three newly collected LEGO manual datasets and a Minecraft house dataset.

* ECCV 2022. Project page: https://cs.stanford.edu/~rcwang/projects/lego_manual
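
The abstract mentions combining 2D keypoint detection with 2D-3D projection. As a rough illustration of the projection half only, the sketch below back-projects a single detected pixel onto a horizontal plane of known height using a standard pinhole camera model. This is not MEPNet's actual algorithm: the function name `backproject_to_plane`, its parameters, and the assumption that the target plane height is known in advance are all hypothetical choices made for this example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def backproject_to_plane(
    u: float,
    v: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    cam_to_world: Sequence[Sequence[float]],
    cam_center: Vec3,
    plane_height: float,
) -> Vec3:
    """Intersect the viewing ray of pixel (u, v) with the world plane z = plane_height.

    Hypothetical helper for illustration; assumes a pinhole camera with
    focal lengths (fx, fy), principal point (cx, cy), a 3x3 camera-to-world
    rotation, and a known camera center in world coordinates.
    """
    # Ray direction in the camera frame (camera looks along its +z axis).
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)

    # Rotate the ray direction into the world frame.
    d_world = tuple(
        sum(cam_to_world[i][j] * d_cam[j] for j in range(3)) for i in range(3)
    )

    # A ray parallel to the plane never hits it.
    if abs(d_world[2]) < 1e-12:
        raise ValueError("viewing ray is parallel to the target plane")

    # Solve cam_center.z + t * d_world.z = plane_height for t.
    t = (plane_height - cam_center[2]) / d_world[2]
    if t <= 0:
        raise ValueError("target plane is behind the camera")

    return tuple(cam_center[i] + t * d_world[i] for i in range(3))


# Example: a camera 10 units above the origin, looking straight down.
LOOK_DOWN = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
point = backproject_to_plane(
    u=420.0, v=240.0, fx=100.0, fy=100.0, cx=320.0, cy=240.0,
    cam_to_world=LOOK_DOWN, cam_center=(0.0, 0.0, 10.0), plane_height=0.0,
)
```

A single pixel only constrains a 3D point to a ray, so some extra constraint is needed to pick a depth. Here that constraint is a known plane height, which is a simplification chosen for the sketch rather than something stated in the abstract.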
