Abstract:We consider the design of a new class of passive iFIR controllers given by the parallel action of an integrator and a finite impulse response filter. iFIRs are more expressive than PID controllers but retain their features and simplicity. The paper provides a model-free data-driven design for passive iFIR controllers based on virtual reference feedback tuning. Passivity is enforced through constrained optimization (three different formulations are discussed). The proposed design does not rely on large datasets or accurate plant models.
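
The controller structure described above, an integrator acting in parallel with a finite impulse response filter, can be sketched as follows. This is only an illustration of the structure: the class name `IFIRController`, the parameters `ki`, `fir_taps`, and `dt`, and the forward-sum integrator are assumptions, and the sketch neither performs the virtual reference feedback tuning nor enforces the passivity constraints discussed in the paper.

```python
from collections import deque


class IFIRController:
    """Illustrative integrator + FIR filter acting in parallel on the error."""

    def __init__(self, ki, fir_taps, dt=1.0):
        self.ki = ki                  # integrator gain (hypothetical name)
        self.taps = list(fir_taps)    # FIR coefficients g_0 .. g_{m-1}
        self.dt = dt                  # sampling period
        self.integral = 0.0           # running integral of the error
        # Most recent errors, newest first, one slot per FIR tap.
        self.history = deque([0.0] * len(self.taps), maxlen=len(self.taps))

    def step(self, error):
        """Return the control output for the current error sample."""
        self.integral += self.dt * error
        self.history.appendleft(error)
        fir_out = sum(g * e for g, e in zip(self.taps, self.history))
        return self.ki * self.integral + fir_out
```

Under these assumptions, a discrete PID with gains `kp`, `ki`, `kd` and a backward-difference derivative corresponds to the two-tap choice `fir_taps=[kp + kd / dt, -kd / dt]`; longer tap vectors give the additional expressiveness.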
Abstract:Heterogeneous systems manipulation, i.e., manipulating rigid objects via deformable (soft) objects, is an emerging field that remains in its early stages of research. Existing works in this field suffer from limited action and operational space, poor generalization ability, and expensive development. To address these challenges, we propose a universally applicable and effective moving primitive, Iterative Grasp-Pull (IGP), and a sample-based framework, DeRi-IGP, to solve the heterogeneous system manipulation task. The DeRi-IGP framework uses the robots' local onboard RGBD sensors to observe the environment, which comprises a soft-rigid body system. It then uses this information to iteratively grasp and pull a soft body (e.g., a rope) to move the attached rigid body to a desired location. We evaluate the effectiveness of our framework in solving various heterogeneous manipulation tasks and compare its performance with several state-of-the-art baselines. The results show that DeRi-IGP outperforms other methods by a significant margin. In addition, we demonstrate the advantage of the large operational space of IGP in the long-distance object acquisition task in both simulated and real environments.
Abstract:Anytime 3D human pose forecasting is crucial to synchronous real-world human-machine interaction, where the term "anytime" corresponds to predicting human pose at any real-valued time step. However, to the best of our knowledge, all existing methods in human pose forecasting perform predictions at preset, discrete time intervals. Therefore, we introduce AnyPose, a lightweight continuous-time neural architecture that models human behavior dynamics with neural ordinary differential equations. We validate our framework on the Human3.6M, AMASS, and 3DPW datasets and conduct a series of comprehensive analyses that compare it with existing methods and examine the intersection of human pose forecasting and neural ordinary differential equations. Our results demonstrate that AnyPose predicts future poses with high accuracy and takes significantly less computational time than traditional methods in solving anytime prediction tasks.
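
The idea of querying a continuous-time model at any real-valued time can be sketched by integrating an ordinary differential equation from the last observed state up to the requested time. The sketch below is not the AnyPose architecture: `toy_dynamics` is a hypothetical stand-in for a learned dynamics network, and `predict_at`, `rk4_step`, and `max_step` are names assumed for illustration, using a fixed-step Runge-Kutta solver.

```python
import math


def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [
        xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def predict_at(f, x0, t, max_step=0.05):
    """Integrate from state x0 at time 0 to any real-valued time t >= 0."""
    if t < 0:
        raise ValueError("t must be non-negative")
    n = max(1, math.ceil(t / max_step))  # number of equal sub-steps
    h = t / n
    x = list(x0)
    for _ in range(n):
        x = rk4_step(f, x, h)
    return x


def toy_dynamics(x):
    """Hypothetical stand-in for a learned dynamics function: dx/dt = -x."""
    return [-xi for xi in x]
```

Because the query time `t` is an ordinary floating-point argument, the same function serves any horizon, for example `predict_at(toy_dynamics, [1.0], 0.37)`, rather than only a preset grid of time steps.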
Abstract:The robot exploration task has been widely studied, with applications ranging from mapping novel environments to item delivery. For some time-critical tasks, such as rescue in catastrophes, the agent is required to explore as efficiently as possible. Recently, the Visit Frequency Map (VFM) representation achieved great success in such scenarios by discouraging repetitive visits with a frequency-based penalty. However, its relatively large size and single-agent setting hinder its further development. In this context, we propose the Integrated Visit Frequency Map, which encodes the same information as the VFM in a more compact form, and a visit frequency-based multi-agent information exchange and control scheme that can accommodate both representations. Tests in diverse settings indicate that our proposed methods achieve performance comparable to VFM with lower bandwidth requirements and generalize well to different multi-agent setups, including real-world environments.
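
A frequency-based penalty of the kind mentioned above can be illustrated with a simple per-cell visit counter. This is a minimal sketch under stated assumptions, not the paper's Visit Frequency Map or Integrated Visit Frequency Map encoding: the grid cells, the linear penalty, and the names `step_penalty`, `pick_next_cell`, and `weight` are all hypothetical.

```python
from collections import Counter


def step_penalty(visits, cell, weight=1.0):
    """Penalty that grows linearly with how often `cell` was already visited."""
    return weight * visits[cell]


def pick_next_cell(visits, candidates, weight=1.0):
    """Pick the candidate with the smallest penalty (first one wins ties)."""
    return min(candidates, key=lambda c: step_penalty(visits, c, weight))


# Record a short trajectory that visits cell (0, 0) twice and (0, 1) once.
visits = Counter()
for cell in [(0, 0), (0, 1), (0, 0)]:
    visits[cell] += 1
```

With this history, `pick_next_cell(visits, [(0, 0), (0, 1), (1, 0)])` selects the unvisited cell `(1, 0)`, which is the sense in which repeated visits are discouraged.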
Abstract:Recent research efforts have yielded significant advancements in manipulating objects under homogeneous settings, where the robot is required to manipulate either rigid or deformable (soft) objects. However, manipulation under heterogeneous setups that involve both deformable and rigid objects remains an unexplored area of research. Such setups are common in various scenarios that involve the transportation of heavy objects via ropes, e.g., on factory floors, at disaster sites, and in forestry. To address this challenge, we introduce DeRi-Bot, the first framework that enables the collaborative manipulation of rigid objects with deformable objects. Our framework comprises an Action Prediction Network (APN) and a Configuration Prediction Network (CPN) to model the complex pattern and stochasticity of soft-rigid body systems. We demonstrate the effectiveness of DeRi-Bot in moving rigid objects to a target position with ropes connected to robotic arms. Furthermore, DeRi-Bot is a distributed method that can accommodate an arbitrary number of robots or human partners without reconfiguration or retraining. We evaluate our framework in both simulated and real-world environments and show that it achieves promising results with strong generalization across different types of objects and multi-agent settings, including human-robot collaboration.
Abstract:The Crossing or Not-Crossing (C/NC) problem is important for autonomous vehicles (AVs) to interact safely with pedestrians. However, this problem setup ignores pedestrians walking along the direction of the vehicle's movement (LONG). To enhance AVs' awareness of pedestrian behavior, we take the first step towards extending C/NC to the C/NC/LONG problem and recognize these states from a single body pose. In contrast, previous C/NC state classification work depends on multiple poses or contextual information. Our proposed shallow neural network classifier is able to recognize these three states within a very short time. We test it on the JAAD dataset and report an average accuracy of 81.23%. To further improve the classifier's performance, we introduce a computationally efficient method, the action momentum optimizer (AMO), which corrects predictions based on crossing behavior patterns. Our experiments show that, with AMO, the classifier performs up to 11.39% better on the continuous pose test. Furthermore, this model can cooperate with different sensors and algorithms that provide 2D pedestrian body pose, so it is able to work across multiple light and weather conditions. In addition, we have created extended pose annotations for the JAAD dataset, which will be publicly released soon.
Abstract:We present a new method to localize a camera within a previously unseen environment perceived from an egocentric point of view. Although this is, in general, an ill-posed problem, humans can effortlessly and efficiently determine their relative location and orientation and navigate in previously unseen environments, e.g., finding a specific item in a new grocery store. To enable such a capability, we design a new egocentric representation, which we call ECO (Egocentric COgnitive map). ECO is biologically inspired by the cognitive map that allows human navigation, and it encodes the surrounding visual semantics with respect to both distance and orientation. ECO possesses three main properties: (1) reconfigurability: complex semantics and geometry are captured via the synthesis of atomic visual representations (e.g., image patches); (2) robustness: the visual semantics are registered in a geometrically consistent way (e.g., aligning with respect to the gravity vector, frontalizing, and rescaling to canonical depth), thus enabling us to learn meaningful atomic representations; (3) adaptability: a domain adaptation framework is designed to generalize the learned representation without manual calibration. As a proof of concept, we use ECO to localize a camera within real-world scenes (various grocery stores) and demonstrate performance improvements over existing semantic localization approaches.