Abstract:Robotic kitting is a critical task in industrial automation that requires the precise arrangement of objects into kits to support downstream production processes. However, when handling complex kitting tasks that involve fine-grained orientation alignment, existing approaches often suffer from limited accuracy and computational efficiency. To address these challenges, we propose Histogram Transporter, a novel kitting framework that learns high-precision pick-and-place actions from scratch using only a few demonstrations. First, our method extracts rotation-equivariant orientation histograms (EOHs) from visual observations using an efficient Fourier-based discretization strategy. These EOHs serve a dual purpose: improving picking efficiency by directly modeling action success probabilities over high-resolution orientations and enhancing placing accuracy by serving as local, discriminative feature descriptors for object-to-placement matching. Second, we introduce a subgroup alignment strategy in the place model that compresses the full spectrum of EOHs into a compact orientation representation, enabling efficient feature matching while preserving accuracy. Finally, we examine the proposed framework on the simulated Hand-Tool Kitting Dataset (HTKD), where it outperforms competitive baselines in both success rates and computational efficiency. Further experiments on five Raven-10 tasks exhibits the remarkable adaptability of our approach, with real-robot trials confirming its applicability for real-world deployment.
Abstract:Robotic kitting has attracted considerable attention in logistics and industrial settings. However, existing kitting methods encounter challenges such as low precision and poor efficiency, limiting their widespread applications. To address these issues, we present a novel kitting framework that improves both the precision and computational efficiency of complex kitting tasks. Firstly, our approach introduces a fine-grained orientation estimation technique in the picking module, significantly enhancing orientation precision while effectively decoupling computational load from orientation granularity. This approach combines an SO(2)-equivariant network with a group discretization operation to preciously predict discrete orientation distributions. Secondly, we develop the Hand-tool Kitting Dataset (HKD) to evaluate the performance of different solutions in handling orientation-sensitive kitting tasks. This dataset comprises a diverse collection of hand tools and synthetically created kits, which reflects the complexities encountered in real-world kitting scenarios. Finally, a series of experiments are conducted to evaluate the performance of the proposed method. The results demonstrate that our approach offers remarkable precision and enhanced computational efficiency in robotic kitting tasks.