Abstract:As collaborative robots move closer to human environments, motion generation and reactive planning strategies that allow for elaborate task execution with minimal easy-to-implement guidance whilst coping with changes in the environment is of paramount importance. In this paper, we present a novel approach for generating real-time motion plans for point-to-point tasks using a single successful human demonstration. Our approach is based on screw linear interpolation,which allows us to respect the underlying geometric constraints that characterize the task and are implicitly present in the demonstration. We also integrate an original reactive collision avoidance approach with our planner. We present extensive experimental results to demonstrate that with our approach,by using a single demonstration of moving one block, we can generate motion plans for complex tasks like stacking multiple blocks (in a dynamic environment). Analogous generalization abilities are also shown for tasks like pouring and loading shelves. For the pouring task, we also show that a demonstration given for one-armed pouring can be used for planning pouring with a dual-armed manipulator of different kinematic structure.
Abstract:Object categories inherently form a hierarchy with different levels of concept abstraction, especially for fine-grained categories. For example, birds (Aves) can be categorized according to a four-level hierarchy of order, family, genus, and species. This hierarchy encodes rich correlations among various categories across different levels, which can effectively regularize the semantic space and thus make prediction less ambiguous. However, previous studies of fine-grained image recognition primarily focus on categories of one certain level and usually overlook this correlation information. In this work, we investigate simultaneously predicting categories of different levels in the hierarchy and integrating this structured correlation information into the deep neural network by developing a novel Hierarchical Semantic Embedding (HSE) framework. Specifically, the HSE framework sequentially predicts the category score vector of each level in the hierarchy, from highest to lowest. At each level, it incorporates the predicted score vector of the higher level as prior knowledge to learn finer-grained feature representation. During training, the predicted score vector of the higher level is also employed to regularize label prediction by using it as soft targets of corresponding sub-categories. To evaluate the proposed framework, we organize the 200 bird species of the Caltech-UCSD birds dataset with the four-level category hierarchy and construct a large-scale butterfly dataset that also covers four level categories. Extensive experiments on these two and the newly-released VegFru datasets demonstrate the superiority of our HSE framework over the baseline methods and existing competitors.