Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Haoyu Jia

Dexterous grasp data augmentation based on grasp synthesis with fingertip workspace cloud and contact-aware sampling

Mar 17, 2026

Liqi Wu, Haoyu Jia, Kento Kawaharazuka, Hirokazu Ishida, Kei Okada

Abstract:Robotic grasping is a fundamental yet crucial component of robotic applications, as effective grasping often serves as the starting point for various tasks. With the rapid advancement of neural networks, data-driven approaches for robotic grasping have become mainstream. However, efficiently generating grasp datasets for training remains a bottleneck. This is compounded by the diverse structures of robotic hands, making the design of generalizable grasp generation methods even more complex. In this work, we propose a teleoperation-based framework to collect a small set of grasp pose demonstrations, which are augmented using FSG--a Fingertip-contact-aware Sampling-based Grasp generator. Based on the demonstrated grasp poses, we propose AutoWS, which automatically generates structured workspace clouds of robotic fingertips, embedding the hand structure information directly into the clouds to eliminate the need for inverse kinematics calculations. Experiments on grasping the YCB objects show that our method significantly outperforms existing approaches in both speed and valid pose generation rate. Our framework enables real-time grasp generation for hands with arbitrary structures and produces human-like grasps when combined with demonstrations, providing an efficient and robust data augmentation tool for data-driven grasp training.

* Accepted to Advanced Robotics, GitHub: https://github.com/W567/FSG, YouTube: https://youtu.be/rFCDl9SxSSA

Via

Access Paper or Ask Questions

Toward Formalizing LLM-Based Agent Designs through Structural Context Modeling and Semantic Dynamics Analysis

Feb 09, 2026

Haoyu Jia, Kento Kawaharazuka, Kei Okada

Abstract:Current research on large language model (LLM) agents is fragmented: discussions of conceptual frameworks and methodological principles are frequently intertwined with low-level implementation details, causing both readers and authors to lose track amid a proliferation of superficially distinct concepts. We argue that this fragmentation largely stems from the absence of an analyzable, self-consistent formal model that enables implementation-independent characterization and comparison of LLM agents. To address this gap, we propose the \texttt{Structural Context Model}, a formal model for analyzing and comparing LLM agents from the perspective of context structure. Building upon this foundation, we introduce two complementary components that together span the full lifecycle of LLM agent research and development: (1) a declarative implementation framework; and (2) a sustainable agent engineering workflow, \texttt{Semantic Dynamics Analysis}. The proposed workflow provides principled insights into agent mechanisms and supports rapid, systematic design iteration. We demonstrate the effectiveness of the complete framework on dynamic variants of the monkey-banana problem, where agents engineered using our approach achieve up to a 32 percentage points improvement in success rate on the most challenging setting.

Via

Access Paper or Ask Questions

Remote Life Support Robot Interface System for Global Task Planning and Local Action Expansion Using Foundation Models

Nov 15, 2024

Yoshiki Obinata, Haoyu Jia, Kento Kawaharazuka, Naoaki Kanazawa, Kei Okada

Figure 1 for Remote Life Support Robot Interface System for Global Task Planning and Local Action Expansion Using Foundation Models

Figure 2 for Remote Life Support Robot Interface System for Global Task Planning and Local Action Expansion Using Foundation Models

Figure 3 for Remote Life Support Robot Interface System for Global Task Planning and Local Action Expansion Using Foundation Models

Figure 4 for Remote Life Support Robot Interface System for Global Task Planning and Local Action Expansion Using Foundation Models

Abstract:Robot systems capable of executing tasks based on language instructions have been actively researched. It is challenging to convey uncertain information that can only be determined on-site with a single language instruction to the robot. In this study, we propose a system that includes ambiguous parts as template variables in language instructions to communicate the information to be collected and the options to be presented to the robot for predictable uncertain events. This study implements prompt generation for each robot action function based on template variables to collect information, and a feedback system for presenting and selecting options based on template variables for user-to-robot communication. The effectiveness of the proposed system was demonstrated through its application to real-life support tasks performed by the robot.

* Accepted to 2024 IEEE-RAS International Conference on Humanoids Robots (Humanoids 2024)

Via

Access Paper or Ask Questions