Human-Robot Interfaces and physical Interaction Lab, Istituto Italiano di Tecnologia, Genoa, Italy
Abstract:Reinforcement learning (RL) has emerged as a promising paradigm in complex and continuous robotic tasks, however, safe exploration has been one of the main challenges, especially in contact-rich manipulation tasks in unstructured environments. Focusing on this issue, we propose SRL-VIC: a model-free safe RL framework combined with a variable impedance controller (VIC). Specifically, safety critic and recovery policy networks are pre-trained where safety critic evaluates the safety of the next action using a risk value before it is executed and the recovery policy suggests a corrective action if the risk value is high. Furthermore, the policies are updated online where the task policy not only achieves the task but also modulates the stiffness parameters to keep a safe and compliant profile. A set of experiments in contact-rich maze tasks demonstrate that our framework outperforms the baselines (without the recovery mechanism and without the VIC), yielding a good trade-off between efficient task accomplishment and safety guarantee. We show our policy trained on simulation can be deployed on a physical robot without fine-tuning, achieving successful task completion with robustness and generalization. The video is available at https://youtu.be/ksWXR3vByoQ.
Abstract:This paper proposes a hybrid optimization and learning method for impact-friendly catching objects at non-zero velocity. Through a constrained Quadratic Programming problem, the method generates optimal trajectories up to the contact point between the robot and the object to minimize their relative velocity and reduce the initial impact forces. Next, the generated trajectories are updated by Kernelized Movement Primitives which are based on human catching demonstrations to ensure a smooth transition around the catching point. In addition, the learned human variable stiffness (HVS) is sent to the robot's Cartesian impedance controller to absorb the post-impact forces and stabilize the catching position. Three experiments are conducted to compare our method with and without HVS against a fixed-position impedance controller (FP-IC). The results showed that the proposed methods outperform the FP-IC, while adding HVS yields better results for absorbing the post-impact forces.
Abstract:Epilepsy is one of the most common neurological diseases, characterized by transient and unprovoked events called epileptic seizures. Electroencephalogram (EEG) is an auxiliary method used to perform both the diagnosis and the monitoring of epilepsy. Given the unexpected nature of an epileptic seizure, its prediction would improve patient care, optimizing the quality of life and the treatment of epilepsy. Predicting an epileptic seizure implies the identification of two distinct states of EEG in a patient with epilepsy: the preictal and the interictal. In this paper, we developed two deep learning models called Temporal Multi-Channel Transformer (TMC-T) and Vision Transformer (TMC-ViT), adaptations of Transformer-based architectures for multi-channel temporal signals. Moreover, we accessed the impact of choosing different preictal duration, since its length is not a consensus among experts, and also evaluated how the sample size benefits each model. Our models are compared with fully connected, convolutional, and recurrent networks. The algorithms were patient-specific trained and evaluated on raw EEG signals from the CHB-MIT database. Experimental results and statistical validation demonstrated that our TMC-ViT model surpassed the CNN architecture, state-of-the-art in seizure prediction.
Abstract:The field of physical human-robot interaction has dramatically evolved in the last decades. As a result, the robotic system's requirements have become more challenging, including personalized behavior for different tasks and users. Various machine learning techniques have been proposed to give the robot such adaptability features. This paper proposes a model-based evolutionary optimization algorithm to tune the apparent impedance of a wrist rehabilitation device. We used passivity to define boundaries for the possible controller outcomes, limiting the shared autonomy of the robot and ensuring the coupled system stability. The experiment consists of a hardware-in-the-loop optimization and a one-degree-of-freedom robot used for wrist rehabilitation. Experimental results showed that the proposed technique could generate customized passive impedance controllers for three subjects. Furthermore, when compared with a constant impedance controller, the method suggested decreased in 20\% the root mean square of interaction torques while maintaining stability during optimization.