Kai Du

Embedding high-resolution touch across robotic hands enables adaptive human-like grasping

Dec 19, 2024

Zihang Zhao, Wanlin Li, Yuyang Li, Tengyu Liu, Boren Li, Meng Wang, Kai Du, Hangxin Liu, Yixin Zhu, Qining Wang(+2 more)

Abstract: Developing robotic hands that adapt to real-world dynamics remains a fundamental challenge in robotics and machine intelligence. Despite significant advances in replicating human hand kinematics and control algorithms, robotic systems still struggle to match human capabilities in dynamic environments, primarily due to inadequate tactile feedback. To bridge this gap, we present F-TAC Hand, a biomimetic hand featuring high-resolution tactile sensing (0.1 mm spatial resolution) across 70% of its surface area. Through optimized hand design, we overcome traditional challenges in integrating high-resolution tactile sensors while preserving the full range of motion. The hand, powered by our generative algorithm that synthesizes human-like hand configurations, demonstrates robust grasping capabilities in dynamic real-world conditions. Extensive evaluation across 600 real-world trials demonstrates that this tactile-embodied system significantly outperforms non-tactile alternatives in complex manipulation tasks (p < 0.0001). These results provide empirical evidence for the critical role of rich tactile embodiment in developing advanced robotic intelligence, offering new perspectives on the relationship between physical sensing capabilities and intelligent behavior.

DeepSPoC: A Deep Learning-Based PDE Solver Governed by Sequential Propagation of Chaos

Aug 29, 2024

Kai Du, Yongle Xie, Tao Zhou, Yuancheng Zhou

Abstract: Sequential propagation of chaos (SPoC) is a recently developed tool for solving mean-field stochastic differential equations and their related nonlinear Fokker-Planck equations. Based on the theory of SPoC, we present a new method (deepSPoC) that combines the interacting particle system of SPoC with deep learning. Within the deepSPoC framework, we consider two classes of frequently used deep models: fully connected neural networks and normalizing flows. For high-dimensional problems, a spatially adaptive method is designed to further improve the accuracy and efficiency of deepSPoC. We analyze the convergence of the deepSPoC framework under some simplified conditions and also provide an a posteriori error estimate for the algorithm. Finally, we test our methods on a wide range of mean-field equations.
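For readers unfamiliar with the particle view of mean-field equations, the following is a minimal sketch of a classical interacting particle approximation, not of SPoC or deepSPoC themselves, whose details are not given in this abstract. The example equation dX = -(X - E[X]) dt + sigma dW, the function name, and all parameter values are illustrative assumptions. The expectation is replaced by the empirical mean of the particles, and time stepping uses the Euler-Maruyama scheme.

```python
import numpy as np


def simulate_mean_field_particles(n_particles=2000, t_end=5.0, dt=0.01,
                                  sigma=1.0, seed=0):
    """Euler-Maruyama for dX = -(X - E[X]) dt + sigma dW (toy example).

    E[X] is approximated by the empirical mean of the particle cloud,
    which is what couples the particles to each other.
    """
    rng = np.random.default_rng(seed)
    # Assumed initial law: N(2, 1).
    x = rng.normal(loc=2.0, scale=1.0, size=n_particles)
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        drift = -(x - x.mean())          # mean-field interaction term
        noise = rng.normal(size=n_particles)
        x = x + drift * dt + sigma * np.sqrt(dt) * noise
    return x


if __name__ == "__main__":
    particles = simulate_mean_field_particles()
    print("empirical mean:", particles.mean())
    print("empirical variance:", particles.var())
```

For this toy equation the mean of the law stays at its initial value and the variance relaxes toward sigma^2 / 2, so the two printed statistics give a quick sanity check on the simulation.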

Integration of cognitive tasks into artificial general intelligence test for large models

Feb 04, 2024

Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du(+3 more)

Abstract: During the evolution of large models, performance evaluation is necessarily performed on the intermediate models to assess their capabilities, and on the well-trained model to ensure safety before practical application. However, current model evaluations mainly rely on specific tasks and datasets, lacking a unified framework for assessing the multidimensional intelligence of large models. In this perspective, we advocate for a comprehensive framework for an artificial general intelligence (AGI) test, aimed at fulfilling the testing needs of large language models and multi-modal large models with enhanced capabilities. The AGI test framework bridges cognitive science and natural language processing to encompass the full spectrum of intelligence facets, including crystallized intelligence, a reflection of amassed knowledge and experience; fluid intelligence, characterized by problem-solving and adaptive reasoning; social intelligence, signifying comprehension and adaptation within multifaceted social scenarios; and embodied intelligence, denoting the ability to interact with the physical environment. To assess the multidimensional intelligence of large models, the AGI test consists of a battery of well-designed cognitive tests adopted from human intelligence tests and then naturally encapsulated in an immersive virtual community. We propose that the complexity of AGI testing tasks should increase commensurate with the advancements in large models. We underscore the need for careful interpretation of test results to avoid false negatives and false positives. We believe that cognitive science-inspired AGI tests will effectively guide the targeted improvement of large models in specific dimensions of intelligence and accelerate the integration of large models into human society.

Counterfactual Graph Transformer for Traffic Flow Prediction

Aug 01, 2023

Ying Yang, Kai Du, Xingyuan Dai, Jianwu Fang

Abstract: Traffic flow prediction (TFP) is a fundamental problem in Intelligent Transportation Systems (ITS), as it models the latent spatial-temporal dependency of traffic flow for potential congestion prediction. Recent graph-based models with multiple kinds of attention mechanisms have achieved promising performance. However, existing methods for traffic flow prediction tend to inherit the bias pattern from the dataset and lack interpretability. To this end, we propose a Counterfactual Graph Transformer (CGT) model with an instance-level explainer (e.g., finding the important subgraphs) specifically designed for TFP. We design a perturbation mask generator over the input sensor features along the time dimension and over the graph structure in the graph transformer module to obtain spatial and temporal counterfactual explanations. By searching for the optimal perturbation masks on the input data features and graph structures, we can obtain the concise and dominant data or graph edge links for the subsequent TFP task. Re-training the graph transformer model after counterfactual perturbation yields improved and interpretable traffic flow prediction. Extensive results on three real-world public datasets show that CGT can produce reliable explanations and is promising for traffic flow prediction.

* accepted by ITSC 2023

UAV Obstacle Avoidance by Human-in-the-Loop Reinforcement in Arbitrary 3D Environment

Apr 07, 2023

Xuyang Li, Jianwu Fang, Kai Du, Kuizhi Mei, Jianru Xue

Abstract: This paper focuses on the continuous control of an unmanned aerial vehicle (UAV) based on a deep reinforcement learning method for large-scale, complex 3D environments. The goal is for the UAV to reach any target point from a given starting point, with variable flying height and speed during navigation. In this work, we propose a deep reinforcement learning (DRL)-based method combined with human-in-the-loop, which allows the UAV to avoid obstacles automatically during flight. We design multiple reward functions based on the relevant domain knowledge to guide UAV navigation. The role of the human-in-the-loop is to dynamically change the reward function of the UAV in different situations so that it better suits the obstacle avoidance task. We verify the success rate and average step size on urban, rural, and forest scenarios, and the experimental results show that the proposed method can reduce the training convergence time and improve the efficiency and accuracy of navigation tasks. The code is available at https://github.com/Monnalo/UAV_navigation.

* accepted in CCC2023

Human Perception-based Evaluation Criterion for Ultra-high Resolution Cell Membrane Segmentation

Oct 16, 2020

Ruohua Shi, Wenyao Wang, Zhixuan Li, Liuyuan He, Kaiwen Sheng, Lei Ma, Kai Du, Tingting Jiang, Tiejun Huang

Abstract: Computer vision technology is widely used in biological and medical data analysis and understanding. However, there are still two major bottlenecks in the field of cell membrane segmentation, which seriously hinder further research: lack of sufficient high-quality data and lack of suitable evaluation criteria. To solve these two problems, this paper first proposes an Ultra-high Resolution Image Segmentation dataset for the Cell membrane, called U-RISC, the largest annotated Electron Microscopy (EM) dataset for the cell membrane, with multiple iterative annotations and uncompressed high-resolution raw data. While analyzing U-RISC, we found that the currently popular segmentation evaluation criteria are inconsistent with human perception. This interesting phenomenon is confirmed by a subjective experiment involving twenty people. Furthermore, to resolve this inconsistency, we propose a new evaluation criterion called Perceptual Hausdorff Distance (PHD) to measure the quality of cell membrane segmentation results. A detailed performance comparison and discussion of classic segmentation methods, along with two iterative manual annotation results, under existing evaluation criteria and PHD are given.

* submitted to ICLR 2021
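As background for the criterion named in the abstract above, here is a minimal sketch of the classical symmetric Hausdorff distance between two binary segmentation masks. This is the standard metric, not the Perceptual Hausdorff Distance (PHD), whose definition is not given in this listing. The function names and the toy masks are illustrative assumptions.

```python
import numpy as np


def hausdorff_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classical symmetric Hausdorff distance between point sets.

    a has shape (n, d) and b has shape (m, d); both must be non-empty.
    """
    # Pairwise Euclidean distances, shape (n, m).
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Directed distances: the farthest any point of one set lies
    # from its nearest neighbour in the other set.
    a_to_b = dist.min(axis=1).max()
    b_to_a = dist.min(axis=0).max()
    return float(max(a_to_b, b_to_a))


def mask_to_points(mask: np.ndarray) -> np.ndarray:
    """Return the (row, col) coordinates of the foreground pixels."""
    return np.argwhere(mask).astype(float)


if __name__ == "__main__":
    # Toy example: a predicted membrane at column 1, ground truth at column 3.
    pred = np.zeros((5, 5), dtype=bool)
    pred[:, 1] = True
    truth = np.zeros((5, 5), dtype=bool)
    truth[:, 3] = True
    print(hausdorff_distance(mask_to_points(pred), mask_to_points(truth)))  # 2.0
```

The pairwise distance matrix grows as n times m, so this brute-force form is only practical for small masks; ultra-high resolution images would need a distance-transform or spatial-index approach instead.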

A simple blind-denoising filter inspired by electrically coupled photoreceptors in the retina

Aug 27, 2018

Yang Yue, Liuyuan He, Gan He, Jian K. Liu, Kai Du, Yonghong Tian, Tiejun Huang

Abstract: Photoreceptors in the retina are coupled by electrical synapses called "gap junctions". It has long been established that gap junctions increase the signal-to-noise ratio of photoreceptors. Inspired by electrically coupled photoreceptors, we introduce a simple filter, the PR-filter, with only one variable. On the BSD68 dataset, the PR-filter showed outstanding performance in SSIM during blind denoising tasks. It also significantly improved the performance of state-of-the-art convolutional neural network blind denoising on non-Gaussian noise. Its ability to preserve more details might be attributed to the small receptive field of the photoreceptors.

* 16 pages, 8 figures, 9 tables, Submitted to NIPS 2018
