Xiuping Liu

S3CE-Net: Spike-guided Spatiotemporal Semantic Coupling and Expansion Network for Long Sequence Event Re-Identification

May 30, 2025

Xianheng Ma, Hongchen Tan, Xiuping Liu, Yi Zhang, Huasheng Wang, Jiang Liu, Ying Chen, Hantao Liu

Abstract: In this paper, we leverage the advantages of event cameras to resist harsh lighting conditions, reduce background interference, achieve high time resolution, and protect facial information to study the long-sequence event-based person re-identification (Re-ID) task. To this end, we propose a simple and efficient long-sequence event Re-ID model, namely the Spike-guided Spatiotemporal Semantic Coupling and Expansion Network (S3CE-Net). To better handle asynchronous event data, we build S3CE-Net based on spiking neural networks (SNNs). The S3CE-Net incorporates the Spike-guided Spatial-temporal Attention Mechanism (SSAM) and the Spatiotemporal Feature Sampling Strategy (STFS). The SSAM is designed to carry out semantic interaction and association in both spatial and temporal dimensions, leveraging the capabilities of SNNs. The STFS involves sampling spatial feature subsequences and temporal feature subsequences from the spatiotemporal dimensions, driving the Re-ID model to perceive broader and more robust effective semantics. Notably, the STFS introduces no additional parameters and is only utilized during the training stage. Therefore, S3CE-Net is a low-parameter and high-efficiency model for long-sequence event-based person Re-ID. Extensive experiments have verified that our S3CE-Net achieves outstanding performance on many mainstream long-sequence event-based person Re-ID datasets. Code is available at: https://github.com/Mhsunshine/SC3E_Net.
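The sampling idea behind STFS can be illustrated with a small, parameter-free sketch. This is not the authors' implementation: the abstract does not say how subsequences are chosen, so the contiguous temporal window, the random spatial token subset, and the names `sample_temporal` and `sample_spatial` are all assumptions made for illustration.

```python
import random

def sample_temporal(features, length, rng):
    """Return a contiguous window of `length` time steps (assumed sampling rule)."""
    start = rng.randrange(len(features) - length + 1)
    return features[start:start + length]

def sample_spatial(features, keep, rng):
    """Keep the same random subset of `keep` spatial tokens at every time step
    (assumed sampling rule)."""
    num_tokens = len(features[0])
    idx = sorted(rng.sample(range(num_tokens), keep))
    return [[frame[i] for i in idx] for frame in features]

# Toy features: T=8 time steps, N=6 spatial tokens, C=4 channels.
rng = random.Random(0)
feats = [[[float(t * 10 + n)] * 4 for n in range(6)] for t in range(8)]

temporal_sub = sample_temporal(feats, length=5, rng=rng)
spatial_sub = sample_spatial(feats, keep=3, rng=rng)

print(len(temporal_sub), len(temporal_sub[0]))  # 5 6
print(len(spatial_sub), len(spatial_sub[0]))    # 8 3
```

Both functions only index into existing features, which is consistent with the abstract's statement that the strategy adds no parameters and is used only during training.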

Designing Pin-pression Gripper and Learning its Dexterous Grasping with Online In-hand Adjustment

May 25, 2025

Hewen Xiao, Xiuping Liu, Hang Zhao, Jian Liu, Kai Xu

Abstract: We introduce a novel design of parallel-jaw grippers drawing inspiration from pin-pression toys. The proposed pin-pression gripper features a distinctive mechanism in which each finger integrates a 2D array of pins capable of independent extension and retraction. This unique design allows the gripper to instantaneously customize its finger's shape to conform to the object being grasped by dynamically adjusting the extension/retraction of the pins. In addition, the gripper excels in in-hand re-orientation of objects for enhanced grasping stability, again by dynamically adjusting the pins. To learn the dynamic grasping skills of pin-pression grippers, we devise a dedicated reinforcement learning algorithm with careful designs of state representation and reward shaping. To achieve a more efficient grasp-while-lift grasping mode, we propose a curriculum learning scheme. Extensive evaluations demonstrate that our design, together with the learned skills, leads to highly flexible and robust grasping with much stronger generality to unseen objects than alternatives. We also highlight encouraging physical results of sim-to-real transfer on a physically manufactured pin-pression gripper, demonstrating the practical significance of our novel gripper design and grasping skill. Demonstration videos for this paper are available at https://github.com/siggraph-pin-pression-gripper/pin-pression-gripper-video.

Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition

May 17, 2025

Runduo Han, Xiuping Liu, Shangxuan Yi, Yi Zhang, Hongchen Tan

Abstract: In this paper, we propose a Multi-modal Collaborative Optimization and Expansion Network (MCO-E Net) that uses the event modality to resist challenges such as low light, high exposure, and high dynamic range in single-eye expression recognition tasks. The MCO-E Net introduces two innovative designs: Multi-modal Collaborative Optimization Mamba (MCO-Mamba) and Heterogeneous Collaborative and Expansion Mixture-of-Experts (HCE-MoE). MCO-Mamba, building upon Mamba, leverages dual-modal information to jointly optimize the model, facilitating collaborative interaction and fusion of modal semantics. This approach encourages the model to balance the learning of both modalities and harness their respective strengths. HCE-MoE, on the other hand, employs a dynamic routing mechanism to distribute structurally varied experts (deep, attention, and focal), fostering collaborative learning of complementary semantics. This heterogeneous architecture systematically integrates diverse feature extraction paradigms to comprehensively capture expression semantics. Extensive experiments demonstrate that our proposed network achieves competitive performance in the task of single-eye expression recognition, especially under poor lighting conditions.
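As a rough illustration of dynamic routing over structurally different experts, the sketch below uses a generic top-k softmax gate. It is not the HCE-MoE design: the gating rule, the `route` function, and the three placeholder experts (stand-ins for the "deep", "attention", and "focal" experts named in the abstract) are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(x, gate_weights, experts, top_k=2):
    """Score each expert with a linear gate, keep the top_k experts,
    and return their softmax-weighted output plus the chosen indices."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    probs = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        out = [o + p * v for o, v in zip(out, y)]
    return out, top

# Hypothetical placeholder experts; real experts would be network branches.
experts = [
    lambda v: [2.0 * a for a in v],           # stand-in for a "deep" expert
    lambda v: [a / (sum(v) or 1.0) for a in v],  # stand-in for an "attention" expert
    lambda v: [a * a for a in v],             # stand-in for a "focal" expert
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

output, chosen = route([1.0, 2.0], gate_weights, experts, top_k=2)
print(chosen)  # [1, 0]
```

With the input `[1.0, 2.0]` the gate scores are 1, 2, and -3, so the second and first experts are selected in that order and the third is skipped.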

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Mar 25, 2025

Juntao Jian, Xiuping Liu, Zixuan Chen, Manyi Li, Jian Liu, Ruizhen Hu

Abstract: Recent advances in dexterous grasping synthesis have demonstrated significant progress in producing reasonable and plausible grasps for many task purposes. However, it remains challenging to generalize to unseen object categories and diverse task instructions. In this paper, we propose G-DexGrasp, a retrieval-augmented generation approach that can produce high-quality dexterous hand configurations for unseen object categories and language-based task instructions. The key is to retrieve generalizable grasping priors, including the fine-grained contact part and the affordance-related distribution of relevant grasping instances, for the following synthesis pipeline. Specifically, the fine-grained contact part and affordance act as generalizable guidance to infer reasonable grasping configurations for unseen objects with a generative model, while the relevant grasping distribution acts as regularization to guarantee the plausibility of synthesized grasps during the subsequent refinement optimization. Our comparison experiments validate the effectiveness of our key designs for generalization and demonstrate remarkable performance compared with existing approaches. Project page: https://g-dexgrasp.github.io/

* 11 pages, 5 figures

Robust Single Object Tracking in LiDAR Point Clouds under Adverse Weather Conditions

Jan 13, 2025

Xiantong Zhao, Xiuping Liu, Shengjing Tian, Yinan Han

Abstract: 3D single object tracking (3DSOT) in LiDAR point clouds is a critical task for outdoor perception, enabling real-time perception of object location, orientation, and motion. Despite the impressive performance of current 3DSOT methods, evaluating them on clean datasets inadequately reflects their comprehensive performance, as the adverse weather conditions in real-world surroundings have not been considered. One of the main obstacles is the lack of adverse weather benchmarks for the evaluation of 3DSOT. To this end, this work proposes a challenging benchmark for LiDAR-based 3DSOT in adverse weather, which comprises two synthetic datasets (KITTI-A and nuScenes-A) and one real-world dataset (CADC-SOT) spanning three weather types: rain, fog, and snow. Based on this benchmark, we evaluate the robustness of five representative 3D trackers from different tracking frameworks and observe significant performance degradation. This prompts the question: What are the factors that cause current advanced methods to fail on such adverse weather samples? Consequently, we explore the impacts of adverse weather and answer the above question from three perspectives: 1) target distance; 2) template shape corruption; and 3) target shape corruption. Finally, based on domain randomization and contrastive learning, we design a dual-branch tracking framework for adverse weather, named DRCT, achieving excellent performance in benchmarks.

* 14 pages

Evaluating the Robustness of LiDAR Point Cloud Tracking Against Adversarial Attack

Oct 28, 2024

Shengjing Tian, Yinan Han, Xiantong Zhao, Bin Liu, Xiuping Liu

Abstract: In this study, we delve into the robustness of neural network-based LiDAR point cloud tracking models under adversarial attacks, a critical aspect often overlooked in favor of performance enhancement. These models, despite incorporating advanced architectures like Transformer or Bird's Eye View (BEV), tend to neglect robustness in the face of challenges such as adversarial attacks, domain shifts, or data corruption. We instead focus on the robustness of the tracking models under the threat of adversarial attacks. We begin by establishing a unified framework for conducting adversarial attacks within the context of 3D object tracking, which allows us to thoroughly investigate both white-box and black-box attack strategies. For white-box attacks, we tailor specific loss functions to accommodate various tracking paradigms and extend existing methods such as FGSM, C&W, and PGD to the point cloud domain. In addressing black-box attack scenarios, we introduce a novel transfer-based approach, the Target-aware Perturbation Generation (TAPG) algorithm, with the dual objectives of achieving high attack performance and maintaining low perceptibility. This method employs a heuristic strategy to enforce sparse attack constraints and utilizes random sub-vector factorization to bolster transferability. Our experimental findings reveal a significant vulnerability in advanced tracking methods when subjected to both black-box and white-box attacks, underscoring the necessity for incorporating robustness against adversarial attacks into the design of LiDAR point cloud tracking models. Notably, compared to existing methods, the TAPG also strikes an optimal balance between the effectiveness of the attack and the concealment of the perturbations.
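For readers unfamiliar with FGSM, the sketch below shows the standard single-step update, x' = x + ε · sign(∇ₓL), applied to point coordinates. It is a generic illustration for robustness evaluation, not the paper's framework: the quadratic `toy_loss` and its analytic gradient are assumptions standing in for a tracker's loss, and the tailored tracking losses mentioned in the abstract are not reproduced here.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def toy_loss(points, center):
    """Hypothetical stand-in loss: summed squared distance to `center`."""
    return sum((p[k] - center[k]) ** 2 for p in points for k in range(3))

def toy_loss_grad(points, center):
    """Analytic gradient of toy_loss with respect to each coordinate."""
    return [[2.0 * (p[k] - center[k]) for k in range(3)] for p in points]

def fgsm_step(points, grad, epsilon):
    """One FGSM step: move every coordinate by epsilon along the gradient sign."""
    return [[p[k] + epsilon * sign(g[k]) for k in range(3)]
            for p, g in zip(points, grad)]

points = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]
center = [0.0, 0.0, 0.0]

grad = toy_loss_grad(points, center)
adv_points = fgsm_step(points, grad, epsilon=0.1)

# The perturbation is bounded by epsilon per coordinate and raises the loss.
print(toy_loss(points, center) < toy_loss(adv_points, center))  # True
```

Coordinates whose gradient is exactly zero are left unchanged, so the perturbation stays within an L-infinity ball of radius ε around the original cloud.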

Physics-Aware Iterative Learning and Prediction of Saliency Map for Bimanual Grasp Planning

Apr 13, 2024

Shiyao Wang, Xiuping Liu, Charlie C. L. Wang, Jian Liu

Abstract: Learning the skill of human bimanual grasping can extend the capabilities of robotic systems when grasping large or heavy objects. However, it requires a much larger search space for grasp points than single-hand grasping and numerous bimanual grasping annotations for network learning, making both data-driven and analytical grasping methods inefficient and insufficient. We propose a framework for bimanual grasp saliency learning that aims to predict the contact points for bimanual grasping based on existing human single-handed grasping data. We learn saliency corresponding vectors through minimal bimanual contact annotations that establish correspondences between grasp positions of both hands, eliminating the need to train on a large-scale bimanual grasp dataset. The existing single-handed grasp saliency value serves as the initial value for bimanual grasp saliency, and we learn a saliency adjustment score that is added to the initial value to obtain the final bimanual grasp saliency value, making it possible to predict preferred bimanual grasp positions from single-handed grasp saliency. We also introduce a physics-balance loss function and a physics-aware refinement module that enable physical grasp balance and enhance generalization to unknown objects. Comprehensive experiments in simulation and comparisons on dexterous grippers have demonstrated that our method can achieve balanced bimanual grasping effectively.

MirrorAttack: Backdoor Attack on 3D Point Cloud with a Distorting Mirror

Mar 09, 2024

Yuhao Bian, Shengjing Tian, Xiuping Liu

Abstract: The widespread deployment of Deep Neural Networks (DNNs) for 3D point cloud processing starkly contrasts with their susceptibility to security breaches, notably backdoor attacks. These attacks hijack DNNs during training, embedding triggers in the data that, once activated, cause the network to make predetermined errors while maintaining normal performance on unaltered data. This vulnerability poses significant risks, especially given the insufficient research on robust defense mechanisms for 3D point cloud networks against such sophisticated threats. Existing attacks either struggle to resist basic point cloud pre-processing methods, or rely on delicate manual design. Exploring simple, effective, imperceptible, and difficult-to-defend triggers in 3D point clouds is still challenging. To address these challenges, we introduce MirrorAttack, a novel effective 3D backdoor attack method, which implants the trigger by simply reconstructing a clean point cloud with an auto-encoder. The data-driven nature of the MirrorAttack obviates the need for complex manual design. Minimizing the reconstruction loss automatically improves imperceptibility. Simultaneously, the reconstruction network endows the trigger with pronounced nonlinearity and sample specificity, rendering traditional preprocessing techniques ineffective in eliminating it. A trigger smoothing module based on spherical harmonic transformation is also attached to regulate the intensity of the attack. Both quantitative and qualitative results verify the effectiveness of our method. We achieve state-of-the-art ASR on different types of victim models with the intervention of defensive techniques. Moreover, the minimal perturbation introduced by our trigger, as assessed by various metrics, attests to the method's stealth, ensuring its imperceptibility.

* 15 pages

Spectrum-guided Feature Enhancement Network for Event Person Re-Identification

Feb 02, 2024

Hongchen Tan, Yi Zhang, Xiuping Liu, Baocai Yin, Nan Ma, Xin Li, Huchuan Lu

Abstract: As a cutting-edge biosensor, the event camera holds significant potential in the field of computer vision, particularly regarding privacy preservation. However, compared to traditional cameras, event streams often contain noise and possess extremely sparse semantics, posing a formidable challenge for event-based person re-identification (event Re-ID). To address this, we introduce a novel event person re-identification network: the Spectrum-guided Feature Enhancement Network (SFE-Net). This network consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM). MSAM employs a Fourier spectrum transform strategy to filter event noise, while also utilizing an event-guided multi-granularity attention strategy to enhance and capture discriminative person semantics. CPDM employs a consecutive patch dropout strategy to generate multiple incomplete feature maps, encouraging the deep Re-ID model to equally perceive each effective region of the person's body and capture robust person descriptors. Extensive experiments on Event Re-ID datasets demonstrate that our SFE-Net achieves the best performance in this task.
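The consecutive patch dropout idea can be sketched as zeroing a contiguous run of patch features, repeated to obtain several incomplete copies. This is an assumption-laden illustration rather than the CPDM module itself: the flat patch ordering, the zero fill, the random start position, and the name `consecutive_patch_dropout` are all hypothetical.

```python
import random

def consecutive_patch_dropout(patches, drop_len, rng):
    """Return a copy of `patches` with `drop_len` consecutive patches zeroed.
    The start position is chosen uniformly at random (assumed rule)."""
    start = rng.randrange(len(patches) - drop_len + 1)
    out = []
    for i, patch in enumerate(patches):
        if start <= i < start + drop_len:
            out.append([0.0] * len(patch))  # dropped patch
        else:
            out.append(list(patch))         # kept patch (copied)
    return out

# Toy feature map: 6 patches with 2 channels each.
rng = random.Random(1)
patches = [[1.0, 1.0] for _ in range(6)]

# Generate several incomplete feature maps from the same input.
incomplete_maps = [consecutive_patch_dropout(patches, 2, rng) for _ in range(3)]
for fmap in incomplete_maps:
    print([int(any(p)) for p in fmap])  # 1 = kept patch, 0 = dropped patch
```

Each printed mask contains exactly two adjacent zeros, and the original `patches` list is left unmodified.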

Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region

Jan 24, 2024

Shengjing Tian, Yinan Han, Xiuping Liu, Xiantong Zhao

Abstract: Single Object Tracking in LiDAR point cloud is one of the most essential parts of environmental perception, in which small objects are inevitable in real-world scenarios and will bring a significant barrier to accurate localization. However, the existing methods concentrate more on exploring universal architectures for common categories and overlook the long-standing challenges posed by small objects, which stem from the relative deficiency of foreground points and a low tolerance for disturbances. To this end, we propose a Siamese network-based method for small object tracking in the LiDAR point cloud, which is composed of the target-awareness prototype mining (TAPM) module and the regional grid subdivision (RGS) module. The TAPM module adopts the reconstruction mechanism of the masked decoder to learn the prototype in the feature space, aiming to highlight the presence of foreground points that will facilitate the subsequent localization of small objects. Although the above prototype is capable of accentuating the small object of interest, the positioning deviation in feature maps still leads to high tracking errors. To alleviate this issue, the RGS module is proposed to recover the fine-grained features of the search region based on ViT and pixel shuffle layers. In addition, apart from the normal settings, we elaborately design a scaling experiment to evaluate the robustness of the different trackers on small objects. Extensive experiments on KITTI and nuScenes demonstrate that our method can effectively improve the tracking performance of small targets without affecting normal-sized objects.
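The pixel shuffle operation mentioned for the RGS module trades channels for spatial resolution. The sketch below implements the commonly used channel-to-space layout (the same index convention as PyTorch's `PixelShuffle`) on nested lists; how the RGS module combines it with ViT is not described in the abstract and is not modeled here.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).
    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]"""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]  # source channel for offset (i, j)
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out

# Four 1x1 channels become one 2x2 channel with upscale factor r=2.
x = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
print(pixel_shuffle(x, 2))  # [[[1.0, 2.0], [3.0, 4.0]]]
```

Because the operation only moves values, a coarse feature map with more channels can be turned into a finer grid without interpolation.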
