Abstract:Robot audition systems with multiple microphone arrays have many applications in practice. However, accurate calibration of multiple microphone arrays remains challenging because there are many unknown parameters to be identified, including the relative transforms (i.e., orientation, translation) and asynchronous factors (i.e., initial time offset and sampling clock difference) between microphone arrays. To tackle these challenges, in this paper, we adopt batch simultaneous localization and mapping (SLAM) for joint calibration of multiple asynchronous microphone arrays and sound source localization. Using the Fisher information matrix (FIM) approach, we first conduct the observability analysis (i.e., parameter identifiability) of the above-mentioned calibration problem and establish necessary/sufficient conditions under which the FIM and the Jacobian matrix have full column rank, which implies the identifiability of the unknown parameters. We also discover several scenarios where the unknown parameters are not uniquely identifiable. Subsequently, we propose an effective framework to initialize the unknown parameters, which is used as the initial guess in batch SLAM for multiple microphone arrays calibration, aiming to further enhance optimization accuracy and convergence. Extensive numerical simulations and real experiments have been conducted to verify the performance of the proposed method. The experiment results show that the proposed pipeline achieves higher accuracy with fast convergence in comparison to methods that use the noise-corrupted ground truth of the unknown parameters as the initial guess in the optimization and other existing frameworks.
Abstract:With the increasing deployment of agricultural robots, the traditional manual spray of liquid fertilizer and pesticide is gradually being replaced by agricultural robots. For robotic precision spray application in vegetable farms, accurate plant phenotyping through instance segmentation and robust plant tracking are of great importance and a prerequisite for the following spray action. Regarding the robust tracking of vegetable plants, to solve the challenging problem of associating vegetables with similar color and texture in consecutive images, in this paper, a novel method of Multiple Object Tracking and Segmentation (MOTS) is proposed for instance segmentation and tracking of multiple vegetable plants. In our approach, contour and blob features are extracted to describe unique feature of each individual vegetable, and associate the same vegetables in different images. By assigning a unique ID for each vegetable, it ensures the robot to spray each vegetable exactly once, while traversing along the farm rows. Comprehensive experiments including ablation studies are conducted, which prove its superior performance over two State-Of-The-Art (SOTA) MOTS methods. Compared to the conventional MOTS methods, the proposed method is able to re-identify objects which have gone out of the camera field of view and re-appear again using the proposed data association strategy, which is important to ensure each vegetable be sprayed only once when the robot travels back and forth. Although the method is tested on lettuce farm, it can be applied to other similar vegetables such as broccoli and canola. Both code and the dataset of this paper is publicly released for the benefit of the community: https://github.com/NanH5837/LettuceMOTS.
Abstract:Multiple microphone arrays have many applications in robot audition, including sound source localization, audio scene perception and analysis, etc. However, accurate calibration of multiple microphone arrays remains a challenge because there are many unknown parameters to be identified, including the Euler angles, geometry, asynchronous factors between the microphone arrays. This paper is concerned with joint calibration of multiple microphone arrays and sound source localization using graph simultaneous localization and mapping (SLAM). By using a Fisher information matrix (FIM) approach, we focus on the observability analysis of the graph SLAM framework for the above-mentioned calibration problem. We thoroughly investigate the identifiability of the unknown parameters, including the Euler angles, geometry, asynchronous effects between the microphone arrays, and the sound source locations. We establish necessary/sufficient conditions under which the FIM and the Jacobian matrix have full column rank, which implies the identifiability of the unknown parameters. These conditions are closely related to the variation in the motion of the sound source and the configuration of microphone arrays, and have intuitive and physical interpretations. We also discover several scenarios where the unknown parameters are not uniquely identifiable. All theoretical findings are demonstrated using simulation data.
Abstract:Acquiring the accurate 3-D position of a target person around a robot provides fundamental and valuable information that is applicable to a wide range of robotic tasks, including home service, navigation and entertainment. This paper presents a real-time robotic 3-D human tracking system which combines a monocular camera with an ultrasonic sensor by the extended Kalman filter (EKF). The proposed system consists of three sub-modules: monocular camera sensor tracking model, ultrasonic sensor tracking model and multi-sensor fusion. An improved visual tracking algorithm is presented to provide partial location estimation (2-D). The algorithm is designed to overcome severe occlusions, scale variation, target missing and achieve robust re-detection. The scale accuracy is further enhanced by the estimated 3-D information. An ultrasonic sensor array is employed to provide the range information from the target person to the robot and Gaussian Process Regression is used for partial location estimation (2-D). EKF is adopted to sequentially process multiple, heterogeneous measurements arriving in an asynchronous order from the vision sensor and the ultrasonic sensor separately. In the experiments, the proposed tracking system is tested in both simulation platform and actual mobile robot for various indoor and outdoor scenes. The experimental results show the superior performance of the 3-D tracking system in terms of both the accuracy and robustness.
Abstract:The main contribution of this paper is an invariant extended Kalman filter (EKF) for visual inertial navigation systems (VINS). It is demonstrated that the conventional EKF based VINS is not invariant under the stochastic unobservable transformation, associated with translations and a rotation about the gravitational direction. This can lead to inconsistent state estimates as the estimator does not obey a fundamental property of the physical system. To address this issue, we use a novel uncertainty representation to derive a Right Invariant error extended Kalman filter (RIEKF-VINS) that preserves this invariance property. RIEKF-VINS is then adapted to the multistate constraint Kalman filter framework to obtain a consistent state estimator. Both Monte Carlo simulations and real-world experiments are used to validate the proposed method.