Abstract: Comprehensive perception of human beings is a prerequisite for ensuring the safety of human-robot interaction. Currently, the prevailing visual sensing approach typically relies on a single static camera, resulting in a restricted and occluded field of view. In our work, we develop an active vision system that uses multiple cameras to dynamically capture multi-source RGB-D data. An integrated human sensing strategy based on a hierarchically connected tree structure is proposed to fuse localized visual information. The tree model consists of nodes representing keypoints and edges representing keyparts, which remain interconnected to preserve structural constraints during multi-source fusion. Using RGB-D data and HRNet, the 3D positions of keypoints are analytically estimated, and their presence is inferred through a sliding window of confidence scores. Subsequently, the point clouds of reliable keyparts are extracted by drawing occlusion-resistant masks, enabling fine registration between the point clouds and a cylindrical model in hierarchical order. Experimental results demonstrate that our method raises keypart recognition recall from 69.20% to 90.10% compared with a single static camera. Furthermore, by overcoming localized and occluded perception, the robotic arm's obstacle-avoidance capability is effectively improved.
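The sliding-window presence inference mentioned in the abstract could be sketched as follows. This is a minimal illustration only: the window length, threshold, class name, and the mean-based decision rule are assumptions for illustration, not details taken from the paper.

```python
from collections import deque

class KeypointPresenceFilter:
    """Infer whether a keypoint is reliably present from a sliding
    window of per-frame detector confidence scores (hypothetical sketch;
    window size and threshold are illustrative assumptions)."""

    def __init__(self, window_size=5, threshold=0.5):
        self.scores = deque(maxlen=window_size)  # fixed-length sliding window
        self.threshold = threshold

    def update(self, score):
        """Append the latest confidence score and return presence decision."""
        self.scores.append(score)
        # Deem the keypoint present if the mean confidence over the
        # window clears the threshold, smoothing single-frame dropouts.
        return sum(self.scores) / len(self.scores) >= self.threshold
```

Averaging over a window rather than thresholding each frame keeps a briefly occluded keypoint from flickering in and out of the tree model between consecutive frames.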
Abstract: Collision detection via visual fences can significantly enhance the safety of collaborative robotic arms. Existing work typically performs such detection with stationary cameras pre-deployed outside the robotic arm's workspace. These stationary cameras provide only a restricted detection range and constrain the mobility of the robotic system. To address this issue, we propose an active sensing method that enables wide-range collision risk evaluation in dynamic scenarios. First, an active vision mechanism is implemented by equipping cameras with additional rotational degrees of freedom. Considering the uncertainty inherent in active sensing, we design a state confidence envelope to uniformly characterize both known and potential dynamic obstacles. Subsequently, using observation-based uncertainty evolution, collision risk is evaluated by predicting obstacle envelopes. On this basis, a Markov decision process is employed to search for an optimal observation sequence for the active sensing system, which enlarges the field of observation and reduces uncertainty in the state estimation of surrounding obstacles. Simulation and real-world experiments consistently demonstrate a 168% increase in the observation time coverage of typical dynamic humanoid obstacles compared with stationary cameras, underscoring our system's effectiveness in collision risk tracking and in enhancing the safety of robotic arms.
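The observation-based uncertainty evolution and the viewpoint search described in the abstract might be sketched as below. This is a simplified stand-in, not the paper's method: the envelope is reduced to a scalar radius, the growth rate and noise floor are invented, and a greedy one-step coverage rule replaces the full Markov decision process.

```python
import math

def evolve_envelope(radius, observed, dt, growth_rate=0.3, noise_floor=0.05):
    """Evolve an obstacle's confidence envelope (modeled here as a bounding
    radius). It inflates while the obstacle is unobserved and contracts toward
    the sensor noise floor on re-observation. Rates are illustrative."""
    if observed:
        return max(noise_floor, radius * 0.5)  # contraction on observation
    return radius + growth_rate * dt           # inflation while unobserved

def select_view(obstacle_angles, envelopes, candidate_yaws,
                fov=math.radians(60)):
    """Greedy one-step stand-in for the MDP viewpoint search: pick the camera
    yaw whose field of view covers the largest total envelope uncertainty."""
    def coverage(yaw):
        return sum(r for a, r in zip(obstacle_angles, envelopes)
                   # wrap the angular difference into [-pi, pi]
                   if abs((a - yaw + math.pi) % (2 * math.pi) - math.pi)
                   <= fov / 2)
    return max(candidate_yaws, key=coverage)
```

A true MDP policy would also weigh future observations (e.g. favoring a view that keeps a fast obstacle trackable over several steps), which the one-step greedy rule ignores.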