Pengfei Xie

DexTOG: Learning Task-Oriented Dexterous Grasp with Language

Apr 06, 2025

Jieyi Zhang, Wenqiang Xu, Zhenjun Yu, Pengfei Xie, Tutian Tang, Cewu Lu

Abstract: This study introduces DexTOG, a novel language-guided, diffusion-based learning framework for task-oriented grasping (TOG) with dexterous hands. Unlike existing methods that mainly focus on 2-finger grippers, this research addresses the complexities of dexterous manipulation, where the system must identify non-unique optimal grasp poses under specific task constraints, cater to multiple valid grasps, and search a high degree-of-freedom configuration space during grasp planning. DexTOG includes a diffusion-based grasp pose generation model, DexDiffu, and a data engine that supports it. Using DexTOG, we also propose a new dataset, DexTOG-80K, which was developed using a Shadow robot hand to perform various tasks on 80 objects from 5 categories, showcasing the dexterity and multi-tasking capabilities of the robotic hand. This research not only presents a significant leap in dexterous TOG but also provides a comprehensive dataset and simulation validation, setting a new benchmark in robotic manipulation research.

* IEEE Robotics and Automation Letters, vol. 10, no. 2, pp. 995-1002, Feb. 2025

Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation

Nov 14, 2024

Zhenjun Yu, Wenqiang Xu, Pengfei Xie, Yutong Li, Cewu Lu

Abstract: We present ViTaM-D, a novel visual-tactile framework for dynamic hand-object interaction reconstruction, integrating distributed tactile sensing for more accurate contact modeling. While existing methods focus primarily on visual inputs, they struggle with capturing detailed contact interactions such as object deformation. Our approach leverages distributed tactile sensors to address this limitation by introducing DF-Field, a distributed force-aware contact representation that models both kinetic and potential energy in hand-object interaction. ViTaM-D first reconstructs hand-object interactions using a visual-only network, VDT-Net, and then refines contact details through a force-aware optimization (FO) process, enhancing object deformation modeling. To benchmark our approach, we introduce the HOT dataset, which features 600 sequences of hand-object interactions, including deformable objects, built in a high-precision simulation environment. Extensive experiments on both the DexYCB and HOT datasets demonstrate significant improvements in accuracy over previous state-of-the-art methods such as gSDF and HOTrack. Our results highlight the superior performance of ViTaM-D in both rigid and deformable object reconstruction, as well as the effectiveness of DF-Field in refining hand poses. This work offers a comprehensive solution to dynamic hand-object interaction reconstruction by seamlessly integrating visual and tactile data. Code, models, and datasets will be available.

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

Apr 16, 2024

Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

Abstract: This work proposes a novel learning framework for visual hand dynamics analysis that takes into account the physiological aspects of hand motion. The existing models, which are simplified joint-actuated systems, often produce unnatural motions. To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO. This model emulates the dynamics of muscles and tendons to drive the skeletal system, imposing physiologically realistic constraints on the resulting torque trajectories. We further propose a simulation-in-the-loop pose refinement framework, BioPR, that refines the initial estimated pose through a multi-layer perceptron (MLP) network. Our evaluation of the accuracy of MS-MANO and the efficacy of BioPR is conducted in two separate parts. The accuracy of MS-MANO is compared with MyoSuite, while the efficacy of BioPR is benchmarked against two large-scale public datasets and two recent state-of-the-art methods. The results demonstrate that our approach consistently improves the baseline methods both quantitatively and qualitatively.

* 11 pages, 5 figures; CVPR 2024

Adaptive Multi-layer Contrastive Graph Neural Networks

Sep 29, 2021

Shuhao Shi, Pengfei Xie, Xu Luo, Kai Qiao, Linyuan Wang, Jian Chen, Bin Yan

Abstract: We present Adaptive Multi-layer Contrastive Graph Neural Networks (AMC-GNN), a self-supervised learning framework for graph neural networks that learns feature representations of sample data without data labels. AMC-GNN generates two graph views by data augmentation and compares the output embeddings of different layers of the graph neural network encoders to obtain feature representations, which can be used for downstream tasks. AMC-GNN learns the importance weights of embeddings in different layers adaptively through an attention mechanism, and an auxiliary encoder is introduced to better train the graph contrastive encoders. Accuracy is improved by maximizing the consistency of positive-pair representations in the early layers and the final embedding space. Our experiments show that the AMC-GNN framework consistently improves results across four established graph benchmarks (the Cora, Citeseer, Pubmed, and DBLP citation network datasets) as well as four newly proposed datasets: Co-author-CS, Co-author-Physics, Amazon-Computers, and Amazon-Photo.
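
As a rough illustration of the idea in this abstract, the sketch below computes a contrastive loss per layer between two views and combines the layers with softmax weights. It is a minimal sketch under assumptions, not the authors' implementation: the abstract does not state the exact loss, so a generic InfoNCE loss with cosine similarity is swapped in, and the names `info_nce`, `multi_layer_loss`, and `layer_scores` are hypothetical. In the paper the layer weights are learned through attention; here they are fixed numbers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss: node i in view_a is the positive of node i in view_b.

    Assumption: a generic contrastive loss, not necessarily the one used
    in the paper.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_layer_loss(layers_a, layers_b, layer_scores, temperature=0.5):
    """Weight each layer's contrastive loss by a softmax over layer scores.

    layer_scores stands in for learned attention scores (hypothetical).
    """
    weights = softmax(layer_scores)
    return sum(w * info_nce(a, b, temperature)
               for w, a, b in zip(weights, layers_a, layers_b))
```

With two orthogonal node embeddings, the loss is lower when the second view keeps the same node order than when the nodes are swapped, which is the "consistency of positive pairs" the abstract refers to.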

* 16 pages, 7 figures

Improving the Transferability of Adversarial Examples with New Iteration Framework and Input Dropout

Jun 22, 2021

Pengfei Xie, Linyuan Wang, Ruoxi Qin, Kai Qiao, Shuhao Shi, Guoen Hu, Bin Yan

Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples. Black-box attacks are the most threatening. At present, black-box attack methods mainly adopt gradient-based iterative attacks, which usually constrain the relationship between the iteration step size, the number of iterations, and the maximum perturbation. In this paper, we propose a new gradient iteration framework that redefines the relationship among these three quantities. Under this framework, we easily improve the attack success rate of DI-TI-MIM. In addition, we propose a gradient iterative attack method based on input dropout, which combines well with our framework. We further propose a multi-dropout-rate version of this method. Experimental results show that our best method achieves an average attack success rate of 96.2% against defense models, which is higher than that of state-of-the-art gradient-based attacks.
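
To make the three quantities in this abstract concrete, the sketch below runs an iterative sign-gradient attack with input dropout on a toy linear scorer. It is a minimal sketch under assumptions, not the paper's method: the abstract does not say how the framework redefines the relationship, so the code only shows the conventional coupling (step size = maximum perturbation / number of iterations) as the default and lets the step size be set freely, with a projection keeping the result inside the perturbation bound. The linear model and the names `input_dropout_attack`, `drop_rate`, and `score` are hypothetical.

```python
import random


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def score(weights, x):
    """Toy linear scorer w . x (a stand-in for a real network)."""
    return sum(w * xi for w, xi in zip(weights, x))


def input_dropout_attack(x0, weights, label, eps, steps, step_size=None,
                         drop_rate=0.0, seed=0):
    """Iterative sign-gradient attack with a random input mask per step.

    The loss being increased is -label * (w . (mask * x)), so its gradient
    with respect to x is -label * w, zeroed wherever the mask drops an input.
    """
    rng = random.Random(seed)
    # Conventional coupling: step size tied to eps and the iteration count.
    if step_size is None:
        step_size = eps / steps
    x = list(x0)
    for _ in range(steps):
        # Sample a fresh dropout mask over the input coordinates.
        mask = [0.0 if rng.random() < drop_rate else 1.0 for _ in x]
        grad = [-label * w * m for w, m in zip(weights, mask)]
        # Take a signed step, then project back into the eps-ball around x0.
        x = [xi + step_size * sign(g) for xi, g in zip(x, grad)]
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
    return x
```

Passing an explicit `step_size` larger than `eps / steps` decouples the step size from the iteration count while the projection still enforces the maximum perturbation, which is one simple way to experiment with the relationship the abstract discusses.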

Seismic Inverse Modeling Method based on Generative Adversarial Network

Jun 08, 2021

Pengfei Xie, YanShu Yin, JiaGen Hou, Mei Chen, Lixin Wang

Abstract: Seismic inverse modeling is a common method in reservoir prediction and plays a vital role in the exploration and development of oil and gas. Conventional seismic inversion methods are difficult to combine with complicated and abstract knowledge of geological patterns, and their uncertainty is difficult to assess. This paper proposes a GAN-based inversion modeling method that is consistent with geology, well logs, and seismic data. GAN is one of the most promising generative model algorithms; it extracts the spatial structure and abstract features of training images. The trained GAN can reproduce models with a specific pattern. In our test, 1000 models were generated in 1 second. Based on the trained GAN after assessment, the optimal models can be calculated through a Bayesian inversion framework. Results show that the inversion models conform to the observation data and have low uncertainty while being generated quickly. This seismic inverse modeling method increases the efficiency and quality of inversion iteration. It is worth studying and applying in the fusion of seismic data and geological knowledge.

* 22 pages, 13 figures
