Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiaotong Zhang

Eye Movement Feature-Guided Signal De-Drifting in Electrooculography Systems

Sep 09, 2025

Lianming Hu, Xiaotong Zhang, Kamal Youcef-Toumi

Abstract:Electrooculography (EOG) is widely used for gaze tracking in Human-Robot Collaboration (HRC). However, baseline drift caused by low-frequency noise significantly impacts the accuracy of EOG signals, creating challenges for further sensor fusion. This paper presents an Eye Movement Feature-Guided De-drift (FGD) method for mitigating drift artifacts in EOG signals. The proposed approach leverages active eye-movement feature recognition to reconstruct the feature-extracted EOG baseline and adaptively correct signal drift while preserving the morphological integrity of the EOG waveform. The FGD is evaluated using both simulation data and real-world data, achieving a significant reduction in mean error. The average error is reduced to 0.896{\deg} in simulation, representing a 36.29% decrease, and to 1.033{\deg} in real-world data, corresponding to a 26.53% reduction. Despite additional and unpredictable noise in real-world data, the proposed method consistently outperforms conventional de-drifting techniques, demonstrating its effectiveness in practical applications such as enhancing human performance augmentation.

* This manuscript has been accepted for presentation at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE) and is currently under publication

Via

Access Paper or Ask Questions

Top-K Maximum Intensity Projection Priors for 3D Liver Vessel Segmentation

Mar 05, 2025

Xiaotong Zhang, Alexander Broersen, Gonnie CM van Erp, Silvia L. Pintea, Jouke Dijkstra

Abstract:Liver-vessel segmentation is an essential task in the pre-operative planning of liver resection. State-of-the-art 2D or 3D convolution-based methods focusing on liver vessel segmentation on 2D CT cross-sectional views, which do not take into account the global liver-vessel topology. To maintain this global vessel topology, we rely on the underlying physics used in the CT reconstruction process, and apply this to liver-vessel segmentation. Concretely, we introduce the concept of top-k maximum intensity projections, which mimics the CT reconstruction by replacing the integral along each projection direction, with keeping the top-k maxima along each projection direction. We use these top-k maximum projections to condition a diffusion model and generate 3D liver-vessel trees. We evaluate our 3D liver-vessel segmentation on the 3D-ircadb-01 dataset, and achieve the highest Dice coefficient, intersection-over-union (IoU), and Sensitivity scores compared to prior work.

* Accepted in 2025 IEEE International Symposium on Biomedical Imaging (ISBI 2025)

Via

Access Paper or Ask Questions

Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

Feb 28, 2025

Shi Meng, Bin Tian, Xiaotong Zhang

Figure 1 for Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

Figure 2 for Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

Figure 3 for Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

Figure 4 for Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

Abstract:Efficient truck dispatching via Reinforcement Learning (RL) in open-pit mining is often hindered by reliance on complex reward engineering and value-based methods. This paper introduces Curriculum-inspired Adaptive Direct Policy Guidance, a novel curriculum learning strategy for policy-based RL to address these issues. We adapt Proximal Policy Optimization (PPO) for mine dispatching's uneven decision intervals using time deltas in Temporal Difference and Generalized Advantage Estimation, and employ a Shortest Processing Time teacher policy for guided exploration via policy regularization and adaptive guidance. Evaluations in OpenMines demonstrate our approach yields a 10% performance gain and faster convergence over standard PPO across sparse and dense reward settings, showcasing improved robustness to reward design. This direct policy guidance method provides a general and effective curriculum learning technique for RL-based truck dispatching, enabling future work on advanced architectures.

Via

Access Paper or Ask Questions

A Graph Attention-Guided Diffusion Model for Liver Vessel Segmentation

Nov 01, 2024

Xiaotong Zhang, Alexander Broersen, Gonnie CM van Erp, Silvia L. Pintea, Jouke Dijkstra

Abstract:Improving connectivity and completeness are the most challenging aspects of small liver vessel segmentation. It is difficult for existing methods to obtain segmented liver vessel trees simultaneously with continuous geometry and detail in small vessels. We proposed a diffusion model-based method with a multi-scale graph attention guidance to break through the bottleneck to segment the liver vessels. Experiments show that the proposed method outperforms the other state-of-the-art methods used in this study on two public datasets of 3D-ircadb-01 and LiVS. Dice coefficient and Sensitivity are improved by at least 11.67% and 24.21% on 3D-ircadb-01 dataset, and are improved by at least 3.21% and 9.11% on LiVS dataset. Connectivity is also quantitatively evaluated in this study and our method performs best. The proposed method is reliable for small liver vessel segmentation.

* This work has been submitted to the IEEE for possible publication

Via

Access Paper or Ask Questions

Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration

Sep 21, 2024

Xiaotong Zhang, Dingcheng Huang, Kamal Youcef-Toumi

Figure 1 for Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration

Figure 2 for Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration

Figure 3 for Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration

Figure 4 for Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration

Abstract:Human intelligence possesses the ability to effectively focus on important environmental components, which enhances perception, learning, reasoning, and decision-making. Inspired by this cognitive mechanism, we introduced a novel concept termed relevance for Human-Robot Collaboration (HRC). Relevance is defined as the importance of the objects based on the applicability and pertinence of the objects for the human objective or other factors. In this paper, we further developed a novel two-loop framework integrating real-time and asynchronous processing to quantify relevance and apply relevance for safer and more efficient HRC. The asynchronous loop leverages the world knowledge from an LLM and quantifies relevance, and the real-time loop executes scene understanding, human intent prediction, and decision-making based on relevance. In decision making, we proposed and developed a human robot task allocation method based on relevance and a novel motion generation and collision avoidance methodology considering the prediction of human trajectory. Simulations and experiments show that our methodology for relevance quantification can accurately and robustly predict the human objective and relevance, with an average accuracy of up to 0.90 for objective prediction and up to 0.96 for relevance prediction. Moreover, our motion generation methodology reduces collision cases by 63.76% and collision frames by 44.74% when compared with a state-of-the-art (SOTA) collision avoidance method. Our framework and methodologies, with relevance, guide the robot on how to best assist humans and generate safer and more efficient actions for HRC.

Via

Access Paper or Ask Questions

Relevance for Human Robot Collaboration

Sep 12, 2024

Xiaotong Zhang, Dingcheng Huang, Kamal Youcef-Toumi

Figure 1 for Relevance for Human Robot Collaboration

Figure 2 for Relevance for Human Robot Collaboration

Figure 3 for Relevance for Human Robot Collaboration

Figure 4 for Relevance for Human Robot Collaboration

Abstract:Effective human-robot collaboration (HRC) requires the robots to possess human-like intelligence. Inspired by the human's cognitive ability to selectively process and filter elements in complex environments, this paper introduces a novel concept and scene-understanding approach termed `relevance.' It identifies relevant components in a scene. To accurately and efficiently quantify relevance, we developed an event-based framework that selectively triggers relevance determination, along with a probabilistic methodology built on a structured scene representation. Simulation results demonstrate that the relevance framework and methodology accurately predict the relevance of a general HRC setup, achieving a precision of 0.99 and a recall of 0.94. Relevance can be broadly applied to several areas in HRC to improve task planning time by 79.56% compared with pure planning for a cereal task, reduce perception latency by up to 26.53% for an object detector, improve HRC safety by up to 13.50% and reduce the number of inquiries for HRC by 75.36%. A real-world demonstration showcases the relevance framework's ability to intelligently assist humans in everyday tasks.

Via

Access Paper or Ask Questions

Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration

Aug 21, 2024

Cheng Xu, Changtian Zhang, Yuchen Shi, Ran Wang, Shihong Duan, Yadong Wan, Xiaotong Zhang

Abstract:Recent advancements in reinforcement learning have made significant impacts across various domains, yet they often struggle in complex multi-agent environments due to issues like algorithm instability, low sampling efficiency, and the challenges of exploration and dimensionality explosion. Hierarchical reinforcement learning (HRL) offers a structured approach to decompose complex tasks into simpler sub-tasks, which is promising for multi-agent settings. This paper advances the field by introducing a hierarchical architecture that autonomously generates effective subgoals without explicit constraints, enhancing both flexibility and stability in training. We propose a dynamic goal generation strategy that adapts based on environmental changes. This method significantly improves the adaptability and sample efficiency of the learning process. Furthermore, we address the critical issue of credit assignment in multi-agent systems by synergizing our hierarchical architecture with a modified QMIX network, thus improving overall strategy coordination and efficiency. Comparative experiments with mainstream reinforcement learning algorithms demonstrate the superior convergence speed and performance of our approach in both single-agent and multi-agent environments, confirming its effectiveness and flexibility in complex scenarios. Our code is open-sourced at: \url{https://github.com/SICC-Group/GMAH}.

Via

Access Paper or Ask Questions

Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing

May 06, 2024

Han Liu, Siyang Zhao, Xiaotong Zhang, Feng Zhang, Wei Wang, Fenglong Ma, Hongyang Chen, Hong Yu, Xianchao Zhang

Figure 1 for Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing

Figure 2 for Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing

Figure 3 for Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing

Figure 4 for Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing

Abstract:Few-shot and zero-shot text classification aim to recognize samples from novel classes with limited labeled samples or no labeled samples at all. While prevailing methods have shown promising performance via transferring knowledge from seen classes to unseen classes, they are still limited by (1) Inherent dissimilarities among classes make the transformation of features learned from seen classes to unseen classes both difficult and inefficient. (2) Rare labeled novel samples usually cannot provide enough supervision signals to enable the model to adjust from the source distribution to the target distribution, especially for complicated scenarios. To alleviate the above issues, we propose a simple and effective strategy for few-shot and zero-shot text classification. We aim to liberate the model from the confines of seen classes, thereby enabling it to predict unseen categories without the necessity of training on seen classes. Specifically, for mining more related unseen category knowledge, we utilize a large pre-trained language model to generate pseudo novel samples, and select the most representative ones as category anchors. After that, we convert the multi-class classification task into a binary classification task and use the similarities of query-anchor pairs for prediction to fully leverage the limited supervision signals. Extensive experiments on six widely used public datasets show that our proposed method can outperform other strong baselines significantly in few-shot and zero-shot tasks, even without using any seen class samples.

* Accepted to AAAI 2024

Via

Access Paper or Ask Questions

HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Feb 02, 2024

Han Liu, Zhi Xu, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Hongyang Chen, Hong Yu, Xianchao Zhang

Figure 1 for HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Figure 2 for HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Figure 3 for HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Figure 4 for HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Abstract:Black-box hard-label adversarial attack on text is a practical and challenging task, as the text data space is inherently discrete and non-differentiable, and only the predicted label is accessible. Research on this problem is still in the embryonic stage and only a few methods are available. Nevertheless, existing methods rely on the complex heuristic algorithm or unreliable gradient estimation strategy, which probably fall into the local optimum and inevitably consume numerous queries, thus are difficult to craft satisfactory adversarial examples with high semantic similarity and low perturbation rate in a limited query budget. To alleviate above issues, we propose a simple yet effective framework to generate high quality textual adversarial examples under the black-box hard-label attack scenarios, named HQA-Attack. Specifically, after initializing an adversarial example randomly, HQA-attack first constantly substitutes original words back as many as possible, thus shrinking the perturbation rate. Then it leverages the synonym set of the remaining changed words to further optimize the adversarial example with the direction which can improve the semantic similarity and satisfy the adversarial condition simultaneously. In addition, during the optimizing procedure, it searches a transition synonym word for each changed word, thus avoiding traversing the whole synonym set and reducing the query number to some extent. Extensive experimental results on five text classification datasets, three natural language inference datasets and two real-world APIs have shown that the proposed HQA-Attack method outperforms other strong baselines significantly.

Via

Access Paper or Ask Questions

How Does Perception Affect Safety: New Metrics and Strategy

Dec 12, 2023

Xiaotong Zhang, Jinger Chong, Kamal Youcef-Toumi

Figure 1 for How Does Perception Affect Safety: New Metrics and Strategy

Figure 2 for How Does Perception Affect Safety: New Metrics and Strategy

Figure 3 for How Does Perception Affect Safety: New Metrics and Strategy

Figure 4 for How Does Perception Affect Safety: New Metrics and Strategy

Abstract:Perception serves as a critical component in the functionality of autonomous agents. However, the intricate relationship between perception metrics and robotic metrics remains unclear, leading to ambiguity in the development and fine-tuning of perception algorithms. In this paper, we introduce a methodology for quantifying this relationship, taking into account factors such as detection rate, detection quality, and latency. Furthermore, we introduce two novel metrics for Human-Robot Collaboration safety predicated upon perception metrics: Critical Collision Probability (CCP) and Average Collision Probability (ACP). To validate the utility of these metrics in facilitating algorithm development and tuning, we develop an attentive processing strategy that focuses exclusively on key input features. This approach significantly reduces computational time while preserving a similar level of accuracy. Experimental results indicate that the implementation of this strategy in an object detector leads to a maximum reduction of 30.091% in inference time and 26.534% in total time per frame. Additionally, the strategy lowers the CCP and ACP in a baseline model by 11.252% and 13.501%, respectively. The source code will be made publicly available in the final proof version of the manuscript.

Via

Access Paper or Ask Questions