Abstract: Robots hold great promise for performing repetitive or hazardous tasks, but achieving human-like dexterity, especially in contact-rich and dynamic environments, remains challenging. Rigid robots, which rely on position or velocity control, often struggle with maintaining stable contact and applying consistent force in force-intensive tasks. Learning from Demonstration has emerged as a solution, but tasks requiring intricate maneuvers, such as powder grinding, present unique difficulties. This paper introduces Diffusion Policies For Compliant Manipulation (DIPCOM), a novel diffusion-based framework designed for compliant control tasks. By leveraging generative diffusion models, we develop a policy that predicts Cartesian end-effector poses and adjusts arm stiffness to maintain the necessary force. Our approach enhances force control through multimodal distribution modeling, improves the integration of diffusion policies in compliance control, and extends our previous work by demonstrating its effectiveness in real-world tasks. We present a detailed comparison between our framework and existing methods, highlighting the advantages and best practices for deploying diffusion-based compliance control.
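
As a rough illustration of the inference loop such a diffusion policy implies, the sketch below reverse-diffuses Gaussian noise into a combined pose-and-stiffness action. The noise-prediction network eps_model, the 9-D action layout (6-D pose delta plus 3-D diagonal stiffness gains), and the DDPM-style noise schedule are illustrative assumptions, not details taken from the paper.

\begin{verbatim}
import torch

def sample_action(eps_model, obs, n_steps=50, action_dim=9):
    """Reverse-diffuse noise into a (pose, stiffness) action (DDPM-style)."""
    betas = torch.linspace(1e-4, 0.02, n_steps)          # noise schedule
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)

    a = torch.randn(1, action_dim)                       # start from pure noise
    for t in reversed(range(n_steps)):
        eps = eps_model(a, obs, torch.tensor([t]))       # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
        mean = (a - coef * eps) / torch.sqrt(alphas[t])  # posterior mean
        noise = torch.randn_like(a) if t > 0 else 0.0
        a = mean + torch.sqrt(betas[t]) * noise
    pose_delta, stiffness = a[0, :6], a[0, 6:]
    return pose_delta, stiffness.clamp(min=0.0)          # gains must stay >= 0

# Stand-in network for demonstration only:
dummy = lambda a, obs, t: torch.zeros_like(a)
pose, gains = sample_action(dummy, obs=None)
\end{verbatim}

The multimodality the abstract mentions falls out of the stochastic reverse process: different noise seeds can settle on different, equally valid contact strategies.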
Abstract: This study addresses contact-rich object insertion tasks in unstructured environments using a robot with a soft wrist, which enables safe contact interactions. For these unstructured environments, we assume that there are uncertainties in the object grasp and hole pose, and that the soft wrist pose cannot be directly measured. Recent methods employ learning approaches and force/torque sensors for contact localization; however, they require data collection in the real world. This study proposes a sim-to-real approach using a privileged training strategy, which has two steps: 1) a teacher policy is trained to complete the task from sensor inputs and ground-truth privileged information, such as the peg pose; then 2) a student encoder is trained on data produced from teacher policy rollouts to estimate the privileged information from the sensor history. We performed sim-to-real experiments under grasp and hole pose uncertainties, obtaining 100\%, 95\%, and 80\% success rates for circular peg insertion with 0, +5, and -5 degree peg misalignments, respectively, and start positions randomly shifted $\pm$ 10 mm from a default position. We also tested the proposed method with a square peg that was never seen during training. Additional simulation evaluations revealed that the privileged strategy improved success rates compared to training with only simulated sensor data. Our results demonstrate the advantage of sim-to-real privileged training for soft robots, which has the potential to reduce human engineering effort for robotic assembly.
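
A minimal sketch of the second (distillation) step might look like the following, assuming a hypothetical StudentEncoder that regresses the privileged peg pose from a window of force/torque and proprioceptive history; the dimensions and architecture are illustrative only.

\begin{verbatim}
import torch
import torch.nn as nn

class StudentEncoder(nn.Module):
    """Estimates privileged info (e.g., a 7-D peg pose) from sensor history."""
    def __init__(self, sensor_dim=12, history=32, priv_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(sensor_dim * history, 256), nn.ReLU(),
            nn.Linear(256, priv_dim),
        )

    def forward(self, sensor_history):           # (batch, history, sensor_dim)
        return self.net(sensor_history)

def distill_step(student, optimizer, sensor_history, priv_truth):
    """One supervised step on data logged from teacher-policy rollouts."""
    loss = nn.functional.mse_loss(student(sensor_history), priv_truth)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

student = StudentEncoder()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
x, y = torch.randn(8, 32, 12), torch.randn(8, 7)   # fake rollout batch
distill_step(student, opt, x, y)
\end{verbatim}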
Abstract: Automating dexterous, contact-rich manipulation tasks with rigid robots is a significant challenge in robotics. Rigid robots, defined by their actuation through position commands, suffer from excessive contact forces due to their inability to adapt to contact with the environment, potentially causing damage. While compliance control schemes have been introduced to mitigate these issues by controlling forces via external sensors, they are hampered by the need to fine-tune task-specific controller parameters. Learning from Demonstration (LfD) offers an intuitive alternative, allowing robots to learn manipulations through observed actions. In this work, we introduce a novel system to enhance the teaching of dexterous, contact-rich manipulations to rigid robots. Our system is twofold: first, it incorporates a teleoperation interface utilizing Virtual Reality (VR) controllers, designed to provide an intuitive and cost-effective method for task demonstration with haptic feedback. Second, we present Comp-ACT (Compliance Control via Action Chunking with Transformers), a method that learns variable compliance control from just a few demonstrations. Our methods have been validated across various complex contact-rich manipulation tasks using single-arm and bimanual robot setups in simulated and real-world environments, demonstrating the effectiveness of our system in teaching robots dexterous manipulations with enhanced adaptability and safety.
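
To make the control flow concrete, here is a minimal sketch of executing variable-stiffness action chunks; the policy interface, the chunk layout (7-D pose target plus stiffness gains), and the env.step_compliant controller call are hypothetical stand-ins, not the actual Comp-ACT API.

\begin{verbatim}
def run_episode(policy, env, chunk_size=20, horizon=400):
    """Execute chunks of (pose, stiffness) actions, open-loop within a chunk."""
    obs = env.reset()
    for _ in range(0, horizon, chunk_size):
        chunk = policy(obs)                    # (chunk_size, 7 + n_gains)
        for action in chunk:
            pose = action[:7]                  # position + quaternion target
            stiffness = action[7:]             # gains for compliance control
            obs = env.step_compliant(pose, stiffness)
    return obs
\end{verbatim}

Predicting a chunk of future actions at a time, as in ACT-style policies, reduces compounding errors while the compliance controller tracks the commanded stiffness between policy queries.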
Abstract: Cooking robots can enhance the home experience by reducing the burden of daily chores. However, these robots must perform their tasks dexterously and safely in shared human environments, especially when handling dangerous tools such as kitchen knives. This study focuses on enabling a robot to autonomously and safely learn food-cutting tasks. More specifically, our goal is to enable a collaborative or industrial robot arm to perform food-slicing tasks by adapting to varying material properties using compliance control. Our approach uses Reinforcement Learning (RL) to train a robot to compliantly manipulate a knife by reducing the contact forces exerted by the food items and by the cutting board. However, training the robot in the real world can be inefficient and dangerous, and can result in significant food waste. Therefore, we propose SliceIt!, a framework for safely and efficiently learning robot food-slicing tasks in simulation. Following a real2sim2real approach, our framework consists of collecting a small amount of real food-slicing data, calibrating our dual simulation environment (a high-fidelity cutting simulator and a robotic simulator), learning compliant control policies in the calibrated simulation environment, and finally deploying the policies on the real robot.
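
As an illustration of the kind of reward such compliant-slicing RL could use, the sketch below rewards cut progress while penalizing contact forces from the food and the cutting board; all weights, limits, and the wrench layout are assumptions for illustration, not values from SliceIt!.

\begin{verbatim}
import numpy as np

def slicing_reward(cut_progress, contact_wrench,
                   force_limit=30.0, w_progress=1.0, w_force=0.01):
    """Reward progress through the cut; penalize contact force magnitude."""
    force_norm = np.linalg.norm(contact_wrench[:3])  # translational forces
    penalty = w_force * force_norm
    if force_norm > force_limit:                     # safety: large penalty
        penalty += 10.0
    return w_progress * cut_progress - penalty
\end{verbatim}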
Abstract: This study tackles the representative yet challenging contact-rich peg-in-hole task of robotic assembly, using a soft wrist that can operate more safely and tolerate lower-frequency control signals than a rigid one. Previous studies often use a fully observable formulation, requiring external setups or estimators for the peg-to-hole pose. In contrast, we use a partially observable formulation and deep reinforcement learning from demonstrations to learn a memory-based agent that acts purely on haptic and proprioceptive signals. Moreover, previous works do not exploit potential domain symmetry and thus must search for solutions in a larger space. Instead, we propose to leverage the symmetry for sample efficiency by augmenting the training data and constructing auxiliary losses that force the agent to adhere to the symmetry. Results in simulation with five different symmetric peg shapes show that our proposed agent can match or even outperform a state-based agent. In particular, the sample efficiency also allows us to learn directly on the real robot within 3 hours.
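
The symmetry idea can be sketched as an auxiliary equivariance loss: rotating the observation about the hole axis should rotate the predicted in-plane action by the same angle. The planar-rotation group, the assumption that the first two observation and action dimensions are x-y components, and the tensor layout are all illustrative.

\begin{verbatim}
import torch

def rotate_xy(v, theta):
    """Rotate (batch, 2) row vectors by angle theta (a 0-d tensor)."""
    c, s = torch.cos(theta), torch.sin(theta)
    rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
    return v @ rot.T

def symmetry_aux_loss(policy, obs, theta):
    """Penalize deviation from rotational equivariance."""
    a = policy(obs)                                  # (batch, action_dim)
    obs_rot = obs.clone()
    obs_rot[:, :2] = rotate_xy(obs[:, :2], theta)    # rotate x-y features
    a_rot = policy(obs_rot)
    return torch.nn.functional.mse_loss(a_rot[:, :2],
                                        rotate_xy(a[:, :2], theta))
\end{verbatim}

Applying the same rotations to stored transitions is the data-augmentation half of the recipe the abstract describes; the loss above is the auxiliary-objective half.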
Abstract: Large language models (LLMs) are accelerating the development of language-guided robot planners. Meanwhile, symbolic planners offer the advantage of interpretability. This paper proposes a new task that bridges these two trends, namely, multimodal planning problem specification. The aim is to generate a problem description (PD), a machine-readable file used by the planners to find a plan. By generating PDs from language instructions and scene observations, we can drive symbolic planners in a language-guided framework. We propose the Vision-Language Interpreter (ViLaIn), a new framework that generates PDs using state-of-the-art LLMs and vision-language models. ViLaIn can refine generated PDs via error-message feedback from the symbolic planner. We seek to answer the question: how accurately can ViLaIn and the symbolic planner generate valid robot plans? To evaluate ViLaIn, we introduce a novel dataset called the problem description generation (ProDG) dataset. The framework is evaluated with four new evaluation metrics. Experimental results show that ViLaIn can generate syntactically correct problems with more than 99\% accuracy and valid plans with more than 58\% accuracy.
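
The refine-via-feedback loop can be sketched as follows; generate_pd (the LLM plus vision-language step) and run_planner (returning a plan or an error message) are hypothetical callables standing in for ViLaIn's actual components.

\begin{verbatim}
def refine_pd(generate_pd, run_planner, instruction, image, max_rounds=3):
    """Generate a problem description (PD); retry with planner feedback."""
    pd = generate_pd(instruction, image)
    for _ in range(max_rounds):
        plan, error = run_planner(pd)
        if plan is not None:                 # planner found a valid plan
            return pd, plan
        pd = generate_pd(instruction, image, feedback=error)
    return pd, None                          # give up after max_rounds
\end{verbatim}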
Abstract: The Reinforcement Learning (RL) paradigm has been an essential tool for automating robotic tasks. Despite the advances in RL, it is still not widely adopted in industry due to the large amount of costly robot-environment interaction it requires. Curriculum Learning (CL) has been proposed to expedite learning, but most research has been evaluated only in simulated environments, from video games to robotic toy tasks. This paper presents a study on accelerating robot learning of contact-rich manipulation tasks based on Curriculum Learning combined with Domain Randomization (DR). We tackle complex industrial assembly tasks with position-controlled robots, such as insertion tasks. We compare different curriculum designs and sampling approaches for DR. Based on this study, we propose a method that significantly outperforms previous work, which uses DR alone (no CL), with less than a fifth of the training time (samples). Results also show that even when training only in simulation with toy tasks, our method can learn policies that transfer to a real-world robot. The learned policies achieved success rates of up to 86\% on real-world complex industrial insertion tasks (with tolerances of $\pm 0.01~mm$) not seen during training.
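
One simple way to combine CL with DR, shown below as an illustrative sketch rather than the paper's exact scheme, is to widen the randomization range whenever the recent success rate clears a threshold.

\begin{verbatim}
import random

class DRCurriculum:
    """Widen domain-randomization bounds as the agent improves."""
    def __init__(self, levels=(0.5, 1.0, 2.0, 5.0, 10.0),
                 promote_at=0.8, window=50):
        self.levels = levels        # e.g., max goal-pose error in mm
        self.level = 0
        self.promote_at = promote_at
        self.window = window
        self.results = []

    def sample_offset(self):
        bound = self.levels[self.level]
        return random.uniform(-bound, bound)

    def report(self, success):
        self.results.append(bool(success))
        recent = self.results[-self.window:]
        if (len(recent) == self.window
                and sum(recent) / self.window >= self.promote_at
                and self.level < len(self.levels) - 1):
            self.level += 1          # promote: harder randomization
            self.results.clear()
\end{verbatim}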
Abstract: Robotic assembly tasks involve complex, low-clearance insertion trajectories with varying contact forces at different stages. While the nominal motion trajectory can be easily obtained from human demonstrations through kinesthetic teaching, teleoperation, or simulation, among other methods, the force profile is harder to obtain, especially when a real robot is unavailable. Even with physics engines, it is difficult to obtain a realistic force profile in simulation, and such simulated force profiles tend to be unsuitable for actual robotic assembly due to the reality gap and uncertainty in the assembly process. To address this problem, we present a combined learning-based framework that imitates human assembly skills through hybrid trajectory learning and force learning. The main contribution of this work is a framework that combines hierarchical imitation learning, to learn the nominal motion trajectory, with a reinforcement learning-based force control scheme to learn an optimal force control policy that satisfies the nominal trajectory while adapting to the force requirements of the assembly task. To further improve the imitation learning part, we develop a hierarchical architecture, following the idea of goal-conditioned imitation learning, that generates the trajectory-learning policy at the \textit{skill} level offline. Through experimental validation, we show that the proposed framework is robust to uncertainty in the assembly task, generates high-quality trajectories, and finds suitable force control policies that adapt to the task's force requirements more efficiently.
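
The hybrid idea can be sketched as a per-step blend: follow the imitation-learned nominal pose, and let the RL force policy add a small, clipped correction computed from the measured wrench. The interfaces and the 2 mm clip below are illustrative assumptions, not the paper's actual controller.

\begin{verbatim}
import numpy as np

def hybrid_step(nominal_pose, wrench, force_policy, max_corr=0.002):
    """Blend the nominal trajectory pose with an RL force correction (m)."""
    corr = np.clip(force_policy(wrench), -max_corr, max_corr)  # (3,)
    target = nominal_pose.copy()
    target[:3] += corr                     # correct position only
    return target

# Toy admittance-like policy for demonstration only:
policy = lambda w: -1e-4 * w[:3]
pose = np.array([0.5, 0.0, 0.2, 0.0, 0.0, 0.0, 1.0])   # xyz + quaternion
target = hybrid_step(pose, np.array([5.0, 0.0, -2.0, 0, 0, 0]), policy)
\end{verbatim}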