Abstract:Recently, video object segmentation (VOS) networks typically use memory-based methods: for each query frame, the mask is predicted by space-time matching to memory frames. Despite these methods having superior performance, they suffer from two issues: 1) Challenging data can destroy the space-time coherence between adjacent video frames. 2) Pixel-level matching will lead to undesired mismatching caused by the noises or distractors. To address the aforementioned issues, we first propose to generate an auxiliary frame between adjacent frames, serving as an implicit short-temporal reference for the query one. Next, we learn a prototype for each video object and prototype-level matching can be implemented between the query and memory. The experiment demonstrated that our network outperforms the state-of-the-art method on the DAVIS 2017, achieving a J&F score of 86.4%, and attains a competitive result 85.0% on YouTube VOS 2018. In addition, our network exhibits a high inference speed of 32+ FPS.
Abstract:High-quality point clouds have practical significance for point-based rendering, semantic understanding, and surface reconstruction. Upsampling sparse, noisy and nonuniform point clouds for a denser and more regular approximation of target objects is a desirable but challenging task. Most existing methods duplicate point features for upsampling, constraining the upsampling scales at a fixed rate. In this work, the flexible upsampling rates are achieved via edge vector based affine combinations, and a novel design of Edge Vector based Approximation for Flexible-scale Point clouds Upsampling (PU-EVA) is proposed. The edge vector based approximation encodes the neighboring connectivity via affine combinations based on edge vectors, and restricts the approximation error within the second-order term of Taylor's Expansion. The EVA upsampling decouples the upsampling scales with network architecture, achieving the flexible upsampling rates in one-time training. Qualitative and quantitative evaluations demonstrate that the proposed PU-EVA outperforms the state-of-the-art in terms of proximity-to-surface, distribution uniformity, and geometric details preservation.
Abstract:This paper presents a new method to solve the inverse kinematic (IK) problem in real-time on soft robots with highly non-linear deformation. The major challenge of efficiently computing IK for such robots is caused by the lack of analytical formulation for either forward or inverse kinematics. To tackle this challenge, we employ neural-networks to learn both the mapping function of forward kinematics and also the Jacobian of this function. As a result, Jacobian-based iteration can be applied to solve the IK problem. A sim-to-real training transfer strategy is conducted to make this approach more practical. We first generate large amount of samples in a simulation environment for learning both the kinematic and the Jacobian networks of a soft robot design. After that, a sim-to-real layer of differentiable neurons is employed to map the results of simulation to the physical hardware, where this sim-to-real layer can be learned from very limited number of training samples generated on the hardware. The effectiveness of our approach has been verified on several pneumatic-driven soft robots in the tasks of trajectory following and interactive positioning.
Abstract:The health state assessment and remaining useful life (RUL) estimation play very important roles in prognostics and health management (PHM), owing to their abilities to reduce the maintenance and improve the safety of machines or equipment. However, they generally suffer from this problem of lacking prior knowledge to pre-define the exact failure thresholds for a machinery operating in a dynamic environment with a high level of uncertainty. In this case, dynamic thresholds depicted by the discrete states is a very attractive way to estimate the RUL of a dynamic machinery. Currently, there are only very few works considering the dynamic thresholds, and these studies adopted different algorithms to determine the discrete states and predict the continuous states separately, which largely increases the complexity of the learning process. In this paper, we propose a novel prognostics approach for RUL estimation of aero-engines with self-joint prediction of continuous and discrete states, wherein the prediction of continuous and discrete states are conducted simultaneously and dynamically within one learning framework.
Abstract:This paper presents a robust matrix elastic net based canonical correlation analysis (RMEN-CCA) for multiple view unsupervised learning problems, which emphasizes the combination of CCA and the robust matrix elastic net (RMEN) used as coupled feature selection. The RMEN-CCA leverages the strength of the RMEN to distill naturally meaningful features without any prior assumption and to measure effectively correlations between different 'views'. We can further employ directly the kernel trick to extend the RMEN-CCA to the kernel scenario with theoretical guarantees, which takes advantage of the kernel trick for highly complicated nonlinear feature learning. Rather than simply incorporating existing regularization minimization terms into CCA, this paper provides a new learning paradigm for CCA and is the first to derive a coupled feature selection based CCA algorithm that guarantees convergence. More significantly, for CCA, the newly-derived RMEN-CCA bridges the gap between measurement of relevance and coupled feature selection. Moreover, it is nontrivial to tackle directly the RMEN-CCA by previous optimization approaches derived from its sophisticated model architecture. Therefore, this paper further offers a bridge between a new optimization problem and an existing efficient iterative approach. As a consequence, the RMEN-CCA can overcome the limitation of CCA and address large-scale and streaming data problems. Experimental results on four popular competing datasets illustrate that the RMEN-CCA performs more effectively and efficiently than do state-of-the-art approaches.