Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yinghao Cai

State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China

SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning

Jun 17, 2025

Hexian Ni, Tao Lu, Haoyuan Hu, Yinghao Cai, Shuo Wang

Abstract:Preference-based Reinforcement Learning (PbRL) methods provide a solution to avoid reward engineering by learning reward models based on human preferences. However, poor feedback- and sample- efficiency still remain the problems that hinder the application of PbRL. In this paper, we present a novel efficient query selection and preference-guided exploration method, called SENIOR, which could select the meaningful and easy-to-comparison behavior segment pairs to improve human feedback-efficiency and accelerate policy learning with the designed preference-guided intrinsic rewards. Our key idea is twofold: (1) We designed a Motion-Distinction-based Selection scheme (MDS). It selects segment pairs with apparent motion and different directions through kernel density estimation of states, which is more task-related and easy for human preference labeling; (2) We proposed a novel preference-guided exploration method (PGE). It encourages the exploration towards the states with high preference and low visits and continuously guides the agent achieving the valuable samples. The synergy between the two mechanisms could significantly accelerate the progress of reward and policy learning. Our experiments show that SENIOR outperforms other five existing methods in both human feedback-efficiency and policy convergence speed on six complex robot manipulation tasks from simulation and four real-worlds.

* 8 pages, 8 figures

Via

Access Paper or Ask Questions

CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding

May 13, 2025

Wenxuan Ma, Xiaoge Cao, Yixiang Zhang, Chaofan Zhang, Shaobo Yang, Peng Hao, Bin Fang, Yinghao Cai, Shaowei Cui, Shuo Wang

Abstract:Recent advancements in integrating tactile sensing with vision-language models (VLMs) have demonstrated remarkable potential for robotic multimodal perception. However, existing tactile descriptions remain limited to superficial attributes like texture, neglecting critical contact states essential for robotic manipulation. To bridge this gap, we propose CLTP, an intuitive and effective language tactile pretraining framework that aligns tactile 3D point clouds with natural language in various contact scenarios, thus enabling contact-state-aware tactile language understanding for contact-rich manipulation tasks. We first collect a novel dataset of 50k+ tactile 3D point cloud-language pairs, where descriptions explicitly capture multidimensional contact states (e.g., contact location, shape, and force) from the tactile sensor's perspective. CLTP leverages a pre-aligned and frozen vision-language feature space to bridge holistic textual and tactile modalities. Experiments validate its superiority in three downstream tasks: zero-shot 3D classification, contact state classification, and tactile 3D large language model (LLM) interaction. To the best of our knowledge, this is the first study to align tactile and language representations from the contact state perspective for manipulation tasks, providing great potential for tactile-language-action model learning. Code and datasets are open-sourced at https://sites.google.com/view/cltp/.

* 16 pages

Via

Access Paper or Ask Questions

NeuGrasp: Generalizable Neural Surface Reconstruction with Background Priors for Material-Agnostic Object Grasp Detection

Mar 05, 2025

Qingyu Fan, Yinghao Cai, Chao Li, Wenzhe He, Xudong Zheng, Tao Lu, Bin Liang, Shuo Wang

Abstract:Robotic grasping in scenes with transparent and specular objects presents great challenges for methods relying on accurate depth information. In this paper, we introduce NeuGrasp, a neural surface reconstruction method that leverages background priors for material-agnostic grasp detection. NeuGrasp integrates transformers and global prior volumes to aggregate multi-view features with spatial encoding, enabling robust surface reconstruction in narrow and sparse viewing conditions. By focusing on foreground objects through residual feature enhancement and refining spatial perception with an occupancy-prior volume, NeuGrasp excels in handling objects with transparent and specular surfaces. Extensive experiments in both simulated and real-world scenarios show that NeuGrasp outperforms state-of-the-art methods in grasping while maintaining comparable reconstruction quality. More details are available at https://neugrasp.github.io/.

* 7 pages, 5 figures. IEEE International Conference on Robotics and Automation (ICRA) 2025

Via

Access Paper or Ask Questions

Exploring Consistency in Graph Representations:from Graph Kernels to Graph Neural Networks

Oct 31, 2024

Xuyuan Liu, Yinghao Cai, Qihui Yang, Yujun Yan

Abstract:Graph Neural Networks (GNNs) have emerged as a dominant approach in graph representation learning, yet they often struggle to capture consistent similarity relationships among graphs. While graph kernel methods such as the Weisfeiler-Lehman subtree (WL-subtree) and Weisfeiler-Lehman optimal assignment (WLOA) kernels are effective in capturing similarity relationships, they rely heavily on predefined kernels and lack sufficient non-linearity for more complex data patterns. Our work aims to bridge the gap between neural network methods and kernel approaches by enabling GNNs to consistently capture relational structures in their learned representations. Given the analogy between the message-passing process of GNNs and WL algorithms, we thoroughly compare and analyze the properties of WL-subtree and WLOA kernels. We find that the similarities captured by WLOA at different iterations are asymptotically consistent, ensuring that similar graphs remain similar in subsequent iterations, thereby leading to superior performance over the WL-subtree kernel. Inspired by these findings, we conjecture that the consistency in the similarities of graph representations across GNN layers is crucial in capturing relational structures and enhancing graph classification performance. Thus, we propose a loss to enforce the similarity of graph representations to be consistent across different layers. Our empirical analysis verifies our conjecture and shows that our proposed consistency loss can significantly enhance graph classification performance across several GNN backbones on various datasets.

* NeurIPS 2024

Via

Access Paper or Ask Questions

Real-Time Marker Localization Learning for GelStereo Tactile Sensing

Nov 24, 2022

Shujuan Liu, Shaowei Cui, Chaofan Zhang, Yinghao Cai, Shuo Wang

Abstract:Visuotactile sensing technology is becoming more popular in tactile sensing, but the effectiveness of the existing marker detection localization methods remains to be further explored. Instead of contour-based blob detection, this paper presents a learning-based marker localization network for GelStereo visuotactile sensing called Marknet. Specifically, the Marknet presents a grid regression architecture to incorporate the distribution of the GelStereo markers. Furthermore, a marker rationality evaluator (MRE) is modelled to screen suitable prediction results. The experimental results show that the Marknet combined with MRE achieves 93.90% precision for irregular markers in contact areas, which outperforms the traditional contour-based blob detection method by a large margin of 42.32%. Meanwhile, the proposed learning-based marker localization method can achieve better real-time performance beyond the blob detection interface provided by the OpenCV library through GPU acceleration, which we believe will lead to considerable perceptual sensitivity gains in various robotic manipulation tasks.

Via

Access Paper or Ask Questions

Learning Symbolic Operators: A Neurosymbolic Solution for Autonomous Disassembly of Electric Vehicle Battery

Jun 07, 2022

Yidong Du, Wenshuo Wang, Zhigang Wang, Hua Yang, Haitao Wang, Yinghao Cai, Ming Chen

Figure 1 for Learning Symbolic Operators: A Neurosymbolic Solution for Autonomous Disassembly of Electric Vehicle Battery

Figure 2 for Learning Symbolic Operators: A Neurosymbolic Solution for Autonomous Disassembly of Electric Vehicle Battery

Figure 3 for Learning Symbolic Operators: A Neurosymbolic Solution for Autonomous Disassembly of Electric Vehicle Battery

Figure 4 for Learning Symbolic Operators: A Neurosymbolic Solution for Autonomous Disassembly of Electric Vehicle Battery

Abstract:The booming of electric vehicles demands efficient battery disassembly for recycling to be environment-friendly. Currently, battery disassembly is still primarily done by humans, probably assisted by robots, due to the unstructured environment and high uncertainties. It is highly desirable to design autonomous solutions to improve work efficiency and lower human risks in high voltage and toxic environments. This paper proposes a novel neurosymbolic method, which augments the traditional Variational Autoencoder (VAE) model to learn symbolic operators based on raw sensory inputs and their relationships. The symbolic operators include a probabilistic state symbol grounding model and a state transition matrix for predicting states after each execution to enable autonomous task and motion planning. At last, the method's feasibility is verified through test results.

Via

Access Paper or Ask Questions

ACDER: Augmented Curiosity-Driven Experience Replay

Nov 16, 2020

Boyao Li, Tao Lu, Jiayi Li, Ning Lu, Yinghao Cai, Shuo Wang

Figure 1 for ACDER: Augmented Curiosity-Driven Experience Replay

Figure 2 for ACDER: Augmented Curiosity-Driven Experience Replay

Figure 3 for ACDER: Augmented Curiosity-Driven Experience Replay

Figure 4 for ACDER: Augmented Curiosity-Driven Experience Replay

Abstract:Exploration in environments with sparse feedback remains a challenging research problem in reinforcement learning (RL). When the RL agent explores the environment randomly, it results in low exploration efficiency, especially in robotic manipulation tasks with high dimensional continuous state and action space. In this paper, we propose a novel method, called Augmented Curiosity-Driven Experience Replay (ACDER), which leverages (i) a new goal-oriented curiosity-driven exploration to encourage the agent to pursue novel and task-relevant states more purposefully and (ii) the dynamic initial states selection as an automatic exploratory curriculum to further improve the sample-efficiency. Our approach complements Hindsight Experience Replay (HER) by introducing a new way to pursue valuable states. Experiments conducted on four challenging robotic manipulation tasks with binary rewards, including Reach, Push, Pick&Place and Multi-step Push. The empirical results show that our proposed method significantly outperforms existing methods in the first three basic tasks and also achieves satisfactory performance in multi-step robotic task learning.

Via

Access Paper or Ask Questions

FameSVD: Fast and Memory-efficient Singular Value Decomposition

Jun 28, 2019

Xiaocan Li, Shuo Wang, Yinghao Cai

Figure 1 for FameSVD: Fast and Memory-efficient Singular Value Decomposition

Figure 2 for FameSVD: Fast and Memory-efficient Singular Value Decomposition

Figure 3 for FameSVD: Fast and Memory-efficient Singular Value Decomposition

Figure 4 for FameSVD: Fast and Memory-efficient Singular Value Decomposition

Abstract:We propose a novel algorithm to perform the Singular Value Decomposition (SVD) by leveraging the internal property of SVD. Due to the derivation being explored deterministically rather than stochastically, the convergence is guaranteed. Complexity analysis is also conducted. Our proposed SVD method outperforms classic algorithms with significant margin both in runtime and memory usage. Furthermore, we discuss the relationship between SVD and Principal Component Analysis (PCA). For those SVD or PCA algorithms that do not acquire all eigenvalues or cannot get them precisely, we utilize the matrix analysis knowledge to get the sum of all eigenvalues in order that cumulative explained variance criterion could be used in not-all-eigenvalues-are-known cases.

Via

Access Paper or Ask Questions

Localizing Discriminative Visual Landmarks for Place Recognition

Apr 14, 2019

Zhe Xin, Yinghao Cai, Tao Lu, Xiaoxia Xing, Shaojun Cai, Jixiang Zhang, Yiping Yang, Yanqing Wang

Figure 1 for Localizing Discriminative Visual Landmarks for Place Recognition

Figure 2 for Localizing Discriminative Visual Landmarks for Place Recognition

Figure 3 for Localizing Discriminative Visual Landmarks for Place Recognition

Figure 4 for Localizing Discriminative Visual Landmarks for Place Recognition

Abstract:We address the problem of visual place recognition with perceptual changes. The fundamental problem of visual place recognition is generating robust image representations which are not only insensitive to environmental changes but also distinguishable to different places. Taking advantage of the feature extraction ability of Convolutional Neural Networks (CNNs), we further investigate how to localize discriminative visual landmarks that positively contribute to the similarity measurement, such as buildings and vegetations. In particular, a Landmark Localization Network (LLN) is designed to indicate which regions of an image are used for discrimination. Detailed experiments are conducted on open source datasets with varied appearance and viewpoint changes. The proposed approach achieves superior performance against state-of-the-art methods.

* 7 pages, 8 figures, ICRA 2019

Via

Access Paper or Ask Questions

Hindsight Generative Adversarial Imitation Learning

Mar 19, 2019

Naijun Liu, Tao Lu, Yinghao Cai, Boyao Li, Shuo Wang

Figure 1 for Hindsight Generative Adversarial Imitation Learning

Figure 2 for Hindsight Generative Adversarial Imitation Learning

Figure 3 for Hindsight Generative Adversarial Imitation Learning

Figure 4 for Hindsight Generative Adversarial Imitation Learning

Abstract:Compared to reinforcement learning, imitation learning (IL) is a powerful paradigm for training agents to learn control policies efficiently from expert demonstrations. However, in most cases, obtaining demonstration data is costly and laborious, which poses a significant challenge in some scenarios. A promising alternative is to train agent learning skills via imitation learning without expert demonstrations, which, to some extent, would extremely expand imitation learning areas. To achieve such expectation, in this paper, we propose Hindsight Generative Adversarial Imitation Learning (HGAIL) algorithm, with the aim of achieving imitation learning satisfying no need of demonstrations. Combining hindsight idea with the generative adversarial imitation learning (GAIL) framework, we realize implementing imitation learning successfully in cases of expert demonstration data are not available. Experiments show that the proposed method can train policies showing comparable performance to current imitation learning methods. Further more, HGAIL essentially endows curriculum learning mechanism which is critical for learning policies.

Via

Access Paper or Ask Questions