Graduate School of Science and Engineering, Saitama University, Saitama, Japan
Abstract:Conventional methods of imitation learning for variable-speed motion have difficulty extrapolating speeds because they rely on learning models running at a constant sampling frequency. This study proposes variable-frequency imitation learning (VFIL), a novel method for imitation learning with learning models trained to run at variable sampling frequencies along with the desired speeds of motion. The experimental results showed that the proposed method improved the velocity-wise accuracy along both the interpolated and extrapolated frequency labels, in addition to a 12.5 % increase in the overall success rate.
Abstract:In recent years, imitation learning using neural networks has enabled robots to perform flexible tasks. However, since neural networks operate in a feedforward structure, they do not possess a mechanism to compensate for output errors. To address this limitation, we developed a feedback mechanism to correct these errors. By employing a hierarchical structure for neural networks comprising lower and upper layers, the lower layer was controlled to follow the upper layer. Additionally, using a multi-layer perceptron in the lower layer, which lacks an internal state, enhanced the error feedback. In the character-writing task, this model demonstrated improved accuracy in writing previously untrained characters. In the character-writing task, this model demonstrated improved accuracy in writing previously untrained characters. Through autonomous control with error feedback, we confirmed that the lower layer could effectively track the output of the upper layer. This study represents a promising step toward integrating neural networks with control theories.
Abstract:Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. In this study, we propose using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
Abstract:Imitation learning enables robots to learn and replicate human behavior from training data. Recent advances in machine learning enable end-to-end learning approaches that directly process high-dimensional observation data, such as images. However, these approaches face a critical challenge when processing data from multiple modalities, inadvertently ignoring data with a lower correlation to the desired output, especially when using short sampling periods. This paper presents a useful method to address this challenge, which amplifies the influence of data with a relatively low correlation to the output by inputting the data into each neural network layer. The proposed approach effectively incorporates diverse data sources into the learning process. Through experiments using a simple pick-and-place operation with raw images and joint information as input, significant improvements in success rates are demonstrated even when dealing with data from short sampling periods.
Abstract:Object grasping is an important ability required for various robot tasks. In particular, tasks that require precise force adjustments during operation, such as grasping an unknown object or using a grasped tool, are difficult for humans to program in advance. Recently, AI-based algorithms that can imitate human force skills have been actively explored as a solution. In particular, bilateral control-based imitation learning achieves human-level motion speeds with environmental adaptability, only requiring human demonstration and without programming. However, owing to hardware limitations, its grasping performance remains limited, and tasks that involves grasping various objects are yet to be achieved. Here, we developed a cross-structure hand to grasp various objects. We experimentally demonstrated that the integration of bilateral control-based imitation learning and the cross-structure hand is effective for grasping various objects and harnessing tools.
Abstract:In contact-rich tasks, setting the stiffness of the control system is a critical factor in its performance. Although the setting range can be extended by making the stiffness matrix asymmetric, its stability has not been proven. This study focuses on the stability of compliance control in a robot arm that deals with an asymmetric stiffness matrix. It discusses the convergence stability of the admittance control. The paper explains how to derive an asymmetric stiffness matrix and how to incorporate it into the admittance model. The authors also present simulation and experimental results that demonstrate the effectiveness of their proposed method.
Abstract:Compliance control is an increasingly employed technique used in the robotic field. It is known that various mechanical properties can be reproduced depending on the design of the stiffness matrix, but the design theory that takes advantage of this high degree of design freedom has not been elucidated. This paper, therefore, discusses the non-diagonal elements of the stiffness matrix. We proposed a design method according to the conditions required for achieving stable motion. Additionally, we analyzed the displacement induced by the non-diagonal elements in response to an external force and found that to obtain stable contact with a symmetric matrix, the matrix should be positive definite, i.e., all eigenvalues must be positive, however its parameter design is complicated. In this study, we focused on the use of asymmetric matrices in compliance control and showed that the design of eigenvalues can be simplified by using a triangular matrix. This approach expands the range of the stiffness design and enhances the ability of the compliance control to induce motion. We conducted experiments using the stiffness matrix and confirmed that assembly could be achieved without complicated trajectory planning.
Abstract:Robots are expected to replace menial tasks such as housework. Some of these tasks include nonprehensile manipulation performed without grasping objects. Nonprehensile manipulation is very difficult because it requires considering the dynamics of environments and objects. Therefore imitating complex behaviors requires a large number of human demonstrations. In this study, a self-supervised learning that considers dynamics to achieve variable speed for nonprehensile manipulation is proposed. The proposed method collects and fine-tunes only successful action data obtained during autonomous operations. By fine-tuning the successful data, the robot learns the dynamics among itself, its environment, and objects. We experimented with the task of scooping and transporting pancakes using the neural network model trained on 24 human-collected training data. The proposed method significantly improved the success rate from 40.2% to 85.7%, and succeeded the task more than 75% for other objects.
Abstract:Recently, motion generation by machine learning has been actively researched to automate various tasks. Imitation learning is one such method that learns motions from data collected in advance. However, executing long-term tasks remains challenging. Therefore, a novel framework for imitation learning is proposed to solve this problem. The proposed framework comprises upper and lower layers, where the upper layer model, whose timescale is long, and lower layer model, whose timescale is short, can be independently trained. In this model, the upper layer learns long-term task planning, and the lower layer learns motion primitives. The proposed method was experimentally compared to hierarchical RNN-based methods to validate its effectiveness. Consequently, the proposed method showed a success rate equal to or greater than that of conventional methods. In addition, the proposed method required less than 1/20 of the training time compared to conventional methods. Moreover, it succeeded in executing unlearned tasks by reusing the trained lower layer.
Abstract:This paper presents a novel interaction planning method that exploits impedance tuning techniques in response to environmental uncertainties and unpredictable conditions using haptic information only. The proposed algorithm plans the robot's trajectory based on the haptic interaction with the environment and adapts planning strategies as needed. Two approaches are considered: Exploration and Bouncing strategies. The Exploration strategy takes the actual motion of the robot into account in planning, while the Bouncing strategy exploits the forces and the motion vector of the robot. Moreover, self-tuning impedance is performed according to the planned trajectory to ensure stable contact and low contact forces. In order to show the performance of the proposed methodology, two experiments with a torque-controller robotic arm are carried out. The first considers a maze exploration without obstacles, whereas the second includes obstacles. The proposed method performance is analyzed and compared against previously proposed solutions in both cases. Experimental results demonstrate that: i) the robot can successfully plan its trajectory autonomously in the most feasible direction according to the interaction with the environment, and ii) a stable interaction with an unknown environment despite the uncertainties is achieved. Finally, a scalability demonstration is carried out to show the potential of the proposed method under multiple scenarios.