Fangqiang Ding

Dexterous Manipulation through Imitation Learning: A Survey

Apr 04, 2025

Shan An, Ziyu Meng, Chao Tang, Yuning Zhou, Tengyu Liu, Fangqiang Ding, Shufang Zhang, Yao Mu, Ran Song, Wei Zhang(+2 more)

Abstract:Dexterous manipulation, which refers to the ability of a robotic hand or multi-fingered end-effector to skillfully control, reorient, and manipulate objects through precise, coordinated finger movements and adaptive force modulation, enables complex interactions similar to human hand dexterity. With recent advances in robotics and machine learning, there is a growing demand for these systems to operate in complex and unstructured environments. Traditional model-based approaches struggle to generalize across tasks and object variations due to the high-dimensionality and complex contact dynamics of dexterous manipulation. Although model-free methods such as reinforcement learning (RL) show promise, they require extensive training, large-scale interaction data, and carefully designed rewards for stability and effectiveness. Imitation learning (IL) offers an alternative by allowing robots to acquire dexterous manipulation skills directly from expert demonstrations, capturing fine-grained coordination and contact dynamics while bypassing the need for explicit modeling and large-scale trial-and-error. This survey provides an overview of dexterous manipulation methods based on imitation learning (IL), details recent advances, and addresses key challenges in the field. Additionally, it explores potential research directions to enhance IL-driven dexterous manipulation. Our goal is to offer researchers and practitioners a comprehensive introduction to this rapidly evolving domain.

* 22 pages, 5 figures

RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar

May 22, 2024

Fangqiang Ding, Xiangyu Wen, Yunzhou Zhu, Yiming Li, Chris Xiaoxuan Lu

Abstract:The 3D occupancy-based perception pipeline has significantly advanced autonomous driving by capturing detailed scene descriptions and demonstrating strong generalizability across various object categories and shapes. Current methods predominantly rely on LiDAR or camera inputs for 3D occupancy prediction. These methods are susceptible to adverse weather conditions, limiting the all-weather deployment of self-driving cars. To improve perception robustness, we leverage the recent advances in automotive radars and introduce a novel approach that utilizes 4D imaging radar sensors for 3D occupancy prediction. Our method, RadarOcc, circumvents the limitations of sparse radar point clouds by directly processing the 4D radar tensor, thus preserving essential scene details. RadarOcc addresses the challenges associated with the voluminous and noisy 4D radar data by employing Doppler bins descriptors, sidelobe-aware spatial sparsification, and range-wise self-attention mechanisms. To minimize the interpolation errors associated with direct coordinate transformations, we also devise a spherical-based feature encoding followed by spherical-to-Cartesian feature aggregation. We benchmark various baseline methods based on distinct modalities on the public K-Radar dataset. The results demonstrate RadarOcc's state-of-the-art performance in radar-based 3D occupancy prediction and promising results even when compared with LiDAR- or camera-based methods. Additionally, we present qualitative evidence of the superior performance of 4D radar in adverse weather conditions and explore the impact of key pipeline components through ablation studies.

* 16 pages, 3 figures

ThermoHands: A Benchmark for 3D Hand Pose Estimation from Egocentric Thermal Image

Mar 14, 2024

Fangqiang Ding, Yunzhou Zhu, Xiangyu Wen, Chris Xiaoxuan Lu

Abstract:In this work, we present ThermoHands, a new benchmark for thermal image-based egocentric 3D hand pose estimation, aimed at overcoming challenges like varying lighting and obstructions (e.g., handwear). The benchmark includes a diverse dataset from 28 subjects performing hand-object and hand-virtual interactions, accurately annotated with 3D hand poses through an automated process. We introduce a bespoke baseline method, TheFormer, utilizing dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight TheFormer's leading performance and affirm thermal imaging's effectiveness in enabling robust 3D hand pose estimation in adverse conditions.

* 20 pages, 6 pages, 5 tables

Moving Object Detection and Tracking with 4D Radar Point Cloud

Sep 18, 2023

Zhijun Pan, Fangqiang Ding, Hantao Zhong, Chris Xiaoxuan Lu

Abstract:Mobile autonomy relies on the precise perception of dynamic environments. Robustly tracking moving objects in the 3D world thus plays a pivotal role in applications like trajectory prediction, obstacle avoidance, and path planning. While most current methods utilize LiDARs or cameras for Multiple Object Tracking (MOT), the capabilities of 4D imaging radars remain largely unexplored. Recognizing the challenges posed by radar noise and point sparsity in 4D radar data, we introduce RaTrack, an innovative solution tailored for radar-based tracking. Bypassing the typical reliance on specific object types and 3D bounding boxes, our method focuses on motion segmentation and clustering, enriched by a motion estimation module. Evaluated on the View-of-Delft dataset, RaTrack showcases superior tracking precision of moving objects, largely surpassing the performance of the state of the art.

* 8 pages, 4 figures. Co-first authorship for Zhijun Pan, Fangqiang Ding and Hantao Zhong

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Jul 03, 2023

Fangqiang Ding, Zhen Luo, Peijun Zhao, Chris Xiaoxuan Lu

Abstract:Approaching the era of ubiquitous computing, human motion sensing plays a crucial role in smart systems for decision making, user interaction, and personalized services. Extensive research has been conducted on human tracking, pose estimation, gesture recognition, and activity recognition, which are predominantly based on cameras in traditional methods. However, the intrusive nature of cameras limits their use in smart home applications. To address this, mmWave radars have gained popularity due to their privacy-friendly features. In this work, we propose milliFlow, a novel deep learning method for scene flow estimation as complementary motion information for mmWave point clouds, serving as an intermediate level of features and directly benefiting downstream human motion sensing tasks. Experimental results demonstrate the superior performance of our method with an average 3D endpoint error of 4.6 cm, significantly surpassing the competing approaches. Furthermore, by incorporating scene flow information, we achieve remarkable improvements in human activity recognition, human parsing, and human body part tracking. To foster further research in this area, we provide our codebase and dataset for open access.

* 15 pages, 8 figures
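
The 3D endpoint error quoted in the milliFlow abstract is commonly computed as the mean Euclidean distance between predicted and ground-truth flow vectors. The following is a minimal Python sketch of that metric, assuming per-point flows are given as (x, y, z) tuples in metres. The function name and the sample values are hypothetical and are not taken from the milliFlow codebase.

```python
import math


def mean_endpoint_error(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    Both arguments are equal-length sequences of (x, y, z) tuples.
    This is an illustrative helper, not the authors' implementation.
    """
    if len(pred_flow) != len(gt_flow) or len(pred_flow) == 0:
        raise ValueError("flows must be non-empty and of equal length")
    # math.dist gives the Euclidean distance between two points (Python 3.8+).
    total = sum(math.dist(p, g) for p, g in zip(pred_flow, gt_flow))
    return total / len(pred_flow)


if __name__ == "__main__":
    # Hypothetical flows for two radar points, in metres.
    pred = [(0.10, 0.0, 0.0), (0.0, 0.05, 0.0)]
    gt = [(0.13, 0.04, 0.0), (0.0, 0.05, 0.0)]
    print(f"EPE: {mean_endpoint_error(pred, gt):.3f} m")
```

Under this definition, a reported value of 4.6 cm would mean the predicted flow vectors miss the ground truth by 4.6 cm on average across points.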

Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

Mar 17, 2023

Fangqiang Ding, Andras Palffy, Dariu M. Gavrila, Chris Xiaoxuan Lu

Abstract:This work proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning. Our approach is motivated by the co-located sensing redundancy in modern autonomous vehicles. Such redundancy implicitly provides various forms of supervision cues to the radar scene flow estimation. Specifically, we introduce a multi-task model architecture for the identified cross-modal learning problem and propose loss functions to opportunistically engage scene flow estimation using multiple cross-modal constraints for effective model training. Extensive experiments show the state-of-the-art performance of our method and demonstrate the effectiveness of cross-modal supervised learning to infer more accurate 4D radar scene flow. We also show its usefulness to two subtasks - motion segmentation and ego-motion estimation. Our source code will be available on https://github.com/Toytiny/CMFlow.

* 10 pages, 7 figures. Accepted by CVPR 2023. See our code at https://github.com/Toytiny/CMFlow. Supplementary materials can be found at https://drive.google.com/file/d/1Iewcqnjzecge2ePBM8k2tg-85LX5xs3N/view

Ad2Attack: Adaptive Adversarial Attack on Real-Time UAV Tracking

Mar 03, 2022

Changhong Fu, Sihang Li, Xinnan Yuan, Junjie Ye, Ziang Cao, Fangqiang Ding

Abstract:Visual tracking is widely adopted in unmanned aerial vehicle (UAV) applications, which places high demands on the robustness of UAV trackers. However, adding imperceptible perturbations can easily fool the tracker and cause tracking failures. This risk is often overlooked and rarely researched at present. Therefore, to help increase awareness of the potential risk and the robustness of UAV tracking, this work proposes a novel adaptive adversarial attack approach, i.e., Ad²Attack, against UAV object tracking. Specifically, adversarial examples are generated online during the resampling of the search patch image, which leads trackers to lose the target in the following frames. Ad²Attack is composed of a direct downsampling module and a super-resolution upsampling module with adaptive stages. A novel optimization function is proposed for balancing the imperceptibility and efficiency of the attack. Comprehensive experiments on several well-known benchmarks and real-world conditions show the effectiveness of our attack method, which dramatically reduces the performance of the most advanced Siamese trackers.

* 7 pages, 7 figures, accepted by ICRA 2022

Self-Supervised Scene Flow Estimation with 4D Automotive Radar

Mar 02, 2022

Fangqiang Ding, Zhijun Pan, Yimin Deng, Jianning Deng, Chris Xiaoxuan Lu

Abstract:Scene flow allows autonomous vehicles to reason about the arbitrary motion of multiple independent objects, which is the key to long-term mobile autonomy. While estimating the scene flow from LiDAR has progressed recently, it remains largely unknown how to estimate the scene flow from a 4D radar - an increasingly popular automotive sensor for its robustness against adverse weather and lighting conditions. Compared with LiDAR point clouds, radar data are drastically sparser, noisier and of much lower resolution. Annotated datasets for radar scene flow are also absent and costly to acquire in the real world. These factors jointly make radar scene flow estimation a challenging problem. This work aims to address the above challenges and estimate scene flow from 4D radar point clouds by leveraging self-supervised learning. A robust scene flow estimation architecture and three novel losses are purpose-designed to cope with intractable radar data. Real-world experimental results validate that our method is able to robustly estimate the radar scene flow in the wild and effectively supports the downstream task of motion segmentation.

* 8 pages, 6 figures, submitted to IEEE Robotics and Automation Letters (RA-L) with IROS 2022 option

Mutation Sensitive Correlation Filter for Real-Time UAV Tracking with Adaptive Hybrid Label

Jun 15, 2021

Guangze Zheng, Changhong Fu, Junjie Ye, Fuling Lin, Fangqiang Ding

Abstract:Unmanned aerial vehicle (UAV) based visual tracking has been confronted with numerous challenges, e.g., object motion and occlusion. These challenges generally introduce unexpected mutations of target appearance and result in tracking failure. However, prevalent discriminative correlation filter (DCF) based trackers are insensitive to target mutations due to a predefined label, which concentrates merely on the centre of the training region. Meanwhile, appearance mutations caused by occlusion or similar objects usually lead to the inevitable learning of wrong information. To cope with appearance mutations, this paper proposes a novel DCF-based method, i.e., MSCF, which enhances sensitivity and resistance to mutations with an adaptive hybrid label. The ideal label is optimized jointly with the correlation filter and maintains temporal consistency. Besides, a novel measurement of mutations called mutation threat factor (MTF) is applied to correct the label dynamically. Considerable experiments are conducted on widely used UAV benchmarks. The results indicate that the performance of the MSCF tracker surpasses 26 other state-of-the-art DCF-based and deep-based trackers. With a real-time speed of ~38 frames/s, the proposed approach is sufficient for UAV tracking tasks.

* Accepted by ICRA 2021, Github: https://github.com/vision4robotics/MSCF-tracker

ADTrack: Target-Aware Dual Filter Learning for Real-Time Anti-Dark UAV Tracking

Jun 04, 2021

Bowen Li, Changhong Fu, Fangqiang Ding, Junjie Ye, Fuling Lin

Abstract:Prior correlation filter (CF)-based tracking methods for unmanned aerial vehicles (UAVs) have focused almost exclusively on tracking in the daytime. However, when night falls, the trackers encounter harsher scenes, which can easily lead to tracking failure. In this regard, this work proposes a novel tracker with anti-dark function (ADTrack). The proposed method integrates an efficient and effective low-light image enhancer into a CF-based tracker. Besides, a target-aware mask is simultaneously generated by virtue of image illumination variation. The target-aware mask can be applied to jointly train a target-focused filter that assists the context filter for robust tracking. Specifically, ADTrack adopts dual regression, where the context filter and the target-focused filter restrict each other for dual filter learning. Exhaustive experiments are conducted on typical dark-scene benchmarks, consisting of 37 typical night sequences from authoritative benchmarks, i.e., UAVDark, and our newly constructed benchmark UAVDark70. The results show that ADTrack favorably outperforms other state-of-the-art trackers and achieves a real-time speed of 34 frames/s on a single CPU, greatly extending robust UAV tracking to night scenes.

* 7 pages, 5 figures
