Abstract:User engagement, cognitive participation, and motivation during task execution in physical human-robot interaction are crucial for motor learning. These factors are especially important in contexts like robotic rehabilitation, where neuroplasticity is targeted. However, traditional robotic rehabilitation systems often face challenges in maintaining user engagement, leading to unpredictable therapeutic outcomes. To address this issue, various techniques, such as assist-as-needed controllers, have been developed to prevent user slacking and encourage active participation. In this paper, we introduce a new direction through a novel multi-modal robotic interaction designed to enhance user engagement by synergistically integrating visual, motor, cognitive, and auditory (speech recognition) tasks into a single, comprehensive activity. To assess engagement quantitatively, we compared multiple electroencephalography (EEG) biomarkers between this multi-modal protocol and a traditional motor-only protocol. Fifteen healthy adult participants completed 100 trials of each task type. Our findings revealed that EEG biomarkers, particularly relative alpha power, showed statistically significant improvements in engagement during the multi-modal task compared to the motor-only task. Moreover, while engagement decreased over time in the motor-only task, the multi-modal protocol maintained consistent engagement, suggesting that users could remain engaged for longer therapy sessions. Our observations on neural responses during interaction indicate that the proposed multi-modal approach can effectively enhance user engagement, which is critical for improving outcomes. This is the first time that objective neural response highlights the benefit of a comprehensive robotic intervention combining motor, cognitive, and auditory functions in healthy subjects.
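As a rough illustration of the relative alpha power biomarker named in the abstract above, the sketch below computes the fraction of spectral power that falls in the alpha band for one EEG channel. The band edges (8 to 13 Hz for alpha, 1 to 40 Hz for the total), the sampling rate, and the function name `relative_band_power` are assumptions made for this example; the abstract does not state the exact definition or preprocessing the authors used.

```python
import numpy as np

def relative_band_power(x, fs, band=(8.0, 13.0), total=(1.0, 40.0)):
    """Fraction of periodogram power inside `band`, relative to `total` (Hz).

    Band edges are assumed defaults, not values taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                # remove DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)     # frequency of each bin
    psd = np.abs(np.fft.rfft(x)) ** 2               # unnormalised periodogram
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    in_total = (freqs >= total[0]) & (freqs <= total[1])
    return psd[in_band].sum() / psd[in_total].sum()

# Synthetic signals: one dominated by 10 Hz (alpha), one by 20 Hz (beta).
fs = 250.0                                          # assumed sampling rate
t = np.arange(0, 4.0, 1.0 / fs)
alpha_dominant = np.sin(2 * np.pi * 10 * t) + 0.1 * np.sin(2 * np.pi * 20 * t)
beta_dominant = 0.1 * np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 20 * t)

ratio_alpha = relative_band_power(alpha_dominant, fs)
ratio_beta = relative_band_power(beta_dominant, fs)
```

Because the measure is a ratio, it is insensitive to overall signal amplitude, which makes it easier to compare across trials and sessions than absolute band power.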
Abstract:Developing accurate hand gesture perception models is critical for various robotic applications, enabling effective communication between humans and machines and directly impacting neurorobotics and interactive robots. Recently, surface electromyography (sEMG) has been explored for its rich informational context and accessibility when combined with advanced machine learning approaches and wearable systems. The literature presents numerous approaches to boost performance while ensuring robustness for neurorobots using sEMG, often resulting in models requiring high processing power, large datasets, and less scalable solutions. This paper addresses this challenge by proposing the decoding of muscle synchronization rather than individual muscle activation. We study coherence-based functional muscle networks as the core of our perception model, proposing that functional synchronization between muscles and the graph-based network of muscle connectivity encode contextual information about intended hand gestures. This can be decoded using shallow machine learning approaches without the need for deep temporal networks. Our technique could impact myoelectric control of neurorobots by reducing computational burdens and enhancing efficiency. The approach is benchmarked on the Ninapro database, which contains 12 EMG signals from 40 subjects performing 17 hand gestures. It achieves an accuracy of 85.1%, demonstrating improved performance compared to existing methods while requiring much less computational power. The results support the hypothesis that a coherence-based functional muscle network encodes critical information related to gesture execution, significantly enhancing hand gesture perception with potential applications for neurorobotic systems and interactive machines.
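The idea of a coherence-based functional muscle network described in the abstract above can be sketched as follows: estimate the magnitude-squared coherence between every pair of EMG channels, average it over a frequency band, and treat the resulting symmetric matrix as a weighted graph. This is a minimal sketch on synthetic data, not the authors' pipeline. The segment length, the 20 to 200 Hz band, the 2 kHz sampling rate, and the function names are all assumptions.

```python
import numpy as np

def segment_spectra(x, nperseg):
    """FFT of Hann-windowed, non-overlapping segments: (n_segments, n_freqs)."""
    n_seg = x.size // nperseg
    segs = x[: n_seg * nperseg].reshape(n_seg, nperseg)
    segs = segs - segs.mean(axis=1, keepdims=True)
    return np.fft.rfft(segs * np.hanning(nperseg), axis=1)

def coherence_network(emg, fs, nperseg=256, band=(20.0, 200.0)):
    """emg: (n_channels, n_samples) -> (n_channels, n_channels) mean coherence."""
    spectra = np.stack([segment_spectra(ch, nperseg) for ch in emg])
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    keep = (freqs >= band[0]) & (freqs <= band[1])
    auto = (np.abs(spectra) ** 2).mean(axis=1)      # auto-spectra, (C, F)
    n_ch = emg.shape[0]
    net = np.eye(n_ch)                              # self-coherence is 1
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            # Cross-spectrum averaged over segments, then normalised.
            cross = (spectra[i] * np.conj(spectra[j])).mean(axis=0)
            coh = np.abs(cross[keep]) ** 2 / (auto[i, keep] * auto[j, keep])
            net[i, j] = net[j, i] = coh.mean()
    return net

# Synthetic example: channels 0 and 1 share a common drive, channel 2 does not.
rng = np.random.default_rng(0)
n = 8192
shared = rng.standard_normal(n)
emg = np.stack([
    shared + 0.3 * rng.standard_normal(n),
    shared + 0.3 * rng.standard_normal(n),
    rng.standard_normal(n),
])
net = coherence_network(emg, fs=2000.0)
edges = net[np.triu_indices(3, k=1)]                # pairs (0,1), (0,2), (1,2)
```

The upper-triangular edge weights in `edges` form a compact feature vector that a shallow classifier could consume, which is the kind of low-cost representation the abstract argues for.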
Abstract:Surface Electromyography (sEMG) is a non-invasive signal that is used in the recognition of hand movement patterns, the diagnosis of diseases, and the robust control of prostheses. Despite the remarkable success of recent end-to-end Deep Learning approaches, they are still limited by the need for large amounts of labeled data. To alleviate the requirement for big data, researchers utilize Feature Engineering, which involves decomposing the sEMG signal into several spatial, temporal, and frequency features. In this paper, we propose utilizing a feature-imitating network (FIN) for closed-form temporal feature learning over a 300 ms signal window on Ninapro DB2, and applying it to the task of recognizing 17 hand movements. We implement a lightweight LSTM-FIN network to imitate four standard temporal features (entropy, root mean square, variance, simple square integral). We then explore transfer learning capabilities by fine-tuning the pre-trained LSTM-FIN on a downstream hand movement recognition task. We observed that the LSTM network can achieve up to 99% R² accuracy in feature reconstruction and 80% accuracy in hand movement recognition. Our results also showed that the model can be robustly applied for both within- and cross-subject movement recognition, as well as in simulated low-latency environments. Overall, our work demonstrates the potential of the FIN modeling paradigm in data-scarce scenarios for sEMG signal processing.
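For reference, three of the closed-form temporal features listed in the abstract above (root mean square, variance, simple square integral) can be computed directly over one signal window, and these values are what a feature-imitating network would be trained to reproduce. The sketch below assumes a 2 kHz sampling rate, so that a 300 ms window holds 600 samples, and uses a hypothetical function name. Entropy is left out because the abstract does not say which entropy definition is used.

```python
import numpy as np

def temporal_features(window):
    """Closed-form temporal features of a 1-D sEMG window."""
    w = np.asarray(window, dtype=float)
    return {
        "rms": float(np.sqrt(np.mean(w ** 2))),   # root mean square
        "var": float(np.var(w, ddof=1)),          # sample variance
        "ssi": float(np.sum(w ** 2)),             # simple square integral
    }

fs = 2000                                         # assumed sampling rate (Hz)
window_ms = 300                                   # window length from the abstract
n = window_ms * fs // 1000                        # samples per window
window = np.sin(2 * np.pi * 50 * np.arange(n) / fs)   # synthetic 50 Hz tone
feats = temporal_features(window)
```

Note that the simple square integral equals the window length times the squared RMS, so the two features differ only by a scale factor for a fixed window size.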
Abstract:Surface electromyography (sEMG) and high-density sEMG (HD-sEMG) biosignals have been extensively investigated for myoelectric control of prosthetic devices, neurorobotics, and more recently human-computer interfaces because of their capability for hand gesture recognition/prediction in a wearable and non-invasive manner. High intraday (same-day) performance has been reported. However, the interday performance (separating training and testing days) is substantially degraded due to the poor generalizability of conventional approaches over time, hindering the application of such techniques in real-life practices. There are limited recent studies on the feasibility of multi-day hand gesture recognition. The existing studies face a major challenge: the need for long sEMG epochs makes the corresponding neural interfaces impractical due to the induced delay in myoelectric control. This paper proposes a compact ViT-based network for multi-day dynamic hand gesture prediction. We tackle the main challenge as the proposed model only relies on very short HD-sEMG signal windows (i.e., 50 ms, accounting for only one-sixth of the convention for real-time myoelectric implementation), boosting agility and responsiveness. Our proposed model can predict 11 dynamic gestures for 20 subjects with an average accuracy of over 71% on the testing day, 3-25 days after training. Moreover, when calibrated on just a small portion of data from the testing day, the proposed model can achieve over 92% accuracy by retraining less than 10% of the parameters for computational efficiency.
Abstract:In the past decade, there has been significant advancement in designing wearable neural interfaces for controlling neurorobotic systems, particularly bionic limbs. These interfaces function by decoding signals captured non-invasively from the skin's surface. Portable high-density surface electromyography (HD-sEMG) modules combined with deep learning decoding have attracted interest by achieving excellent gesture prediction and myoelectric control of prosthetic systems and neurorobots. However, factors like pixel-shape electrode size and unstable skin contact make HD-sEMG susceptible to pixel electrode drops. The sparse electrode-skin disconnections rooted in issues such as low adhesion, sweating, hair blockage, and skin stretch challenge the reliability and scalability of these modules as the perception unit for neurorobotic systems. This paper proposes a novel deep-learning model providing resiliency for HD-sEMG modules, which can be used in the wearable interfaces of neurorobots. The proposed 3D Dilated Efficient CapsNet model trains on an augmented input space to computationally "force" the network to learn channel dropout variations and thus learn robustness to channel dropout. The proposed framework maintained high performance in the sensor dropout reliability study we conducted. Results show that the performance of conventional models degrades significantly with dropout and is recovered using the proposed architecture and training paradigm.
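One simple way to build the kind of augmented input space mentioned in the abstract above is to zero out randomly chosen electrodes in each training sample, so the network sees many dropout patterns during training. The sketch below is only an assumption about how such augmentation could look. The abstract does not describe the authors' actual procedure, and the 8 x 16 grid, the drop probability, and the function name `drop_channels` are hypothetical.

```python
import numpy as np

def drop_channels(batch, drop_prob, rng):
    """Zero whole electrodes independently with probability `drop_prob`.

    batch: (n_samples, rows, cols, time) HD-sEMG windows on an electrode grid.
    Returns the augmented batch and the boolean keep-mask (True = kept).
    """
    mask = rng.random(batch.shape[:3]) >= drop_prob
    # Broadcast the per-electrode mask across the time axis.
    return batch * mask[..., None], mask

rng = np.random.default_rng(0)
batch = rng.standard_normal((32, 8, 16, 100))     # assumed 8 x 16 grid
augmented, mask = drop_channels(batch, drop_prob=0.1, rng=rng)
```

Drawing a fresh mask for every batch exposes the model to a different set of missing electrodes each time, rather than to a single fixed failure pattern.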
Abstract:This paper proposes a smart handheld textural sensing medical device with complementary Machine Learning (ML) algorithms to enable on-site Colorectal Cancer (CRC) polyp diagnosis and pathology of excised tumors. The proposed handheld edge device benefits from a unique tactile sensing module and a dual-stage machine learning algorithm (composed of a dilated residual network and a t-SNE engine) for polyp type and stiffness characterization. Solely utilizing the occlusion-free, illumination-resilient textural images captured by the proposed tactile sensor, the framework is able to sensitively and reliably identify the type and stage of CRC polyps by classifying their texture and stiffness, respectively. Moreover, the proposed handheld medical edge device benefits from internet connectivity for enabling remote digital pathology (boosting the diagnosis in operating rooms and promoting accessibility and equity in medical diagnosis).
Abstract:This paper investigates the influence of the internal geometrical structure of soft pneu-nets on the dynamic response and hysteresis of the actuators. The research findings indicate that by strategically manipulating the stress distribution within soft robots, it is possible to enhance the dynamic response while reducing hysteresis. The study utilizes the Finite Element Method (FEM) and includes experimental validation through markerless motion tracking of the soft robot. In particular, the study examines actuator bending angles up to 500% strain while achieving 95% accuracy in predicting the bending angle. The results demonstrate that the particular design with the minimum air chamber width in the center significantly improves both high- and low-frequency hysteresis behavior by 21.5% while also enhancing dynamic response by 60% to 112% across various frequencies and peak-to-peak pressures. Consequently, the paper evaluates the effectiveness of "mechanically programming" stress distribution and distributed energy storage within soft robots to maximize their dynamic performance, offering direct benefits for control.
Abstract:Recently, skew-t mixture models have been introduced as a flexible probabilistic modeling technique taking into account both skewness in data clusters and the statistical degree of freedom (S-DoF) to improve modeling generalizability, and robustness to heavy tails and skewness. In this paper, we show that the state-of-the-art skew-t mixture models fundamentally suffer from a hidden phenomenon named here as "S-DoF explosion," which results in local minima in the shapes of normal kernels during the non-convex iterative process of expectation maximization. For the first time, this paper provides insights into the instability of the S-DoF, which can result in the divergence of the kernels from the mixture of t-distribution, losing generalizability and power for modeling the outliers. Thus, in this paper, we propose a regularized iterative optimization process to train the mixture model, enhancing the generalizability and resiliency of the technique. The resulting mixture model is named Finite Mixture of Multivariate Regulated Skew-t (FiMReSt) Kernels, which stabilizes the S-DoF profile during the optimization process of learning. To validate the performance, we have conducted a comprehensive experiment on several real-world datasets and a synthetic dataset. The results highlight (a) superior performance of the FiMReSt, (b) generalizability in the presence of outliers, and (c) convergence of S-DoF.
Abstract:The intrinsic biomechanical characteristic of the human upper limb plays a central role in absorbing the interactive energy during physical human-robot interaction (pHRI). We have recently shown that based on the concept of "Excess of Passivity (EoP)" from nonlinear control theory, it is possible to decode such energetic behavior for both upper and lower limbs. The extracted knowledge can be used in the design of controllers for optimizing the transparency and fidelity of force fields in human-robot interaction and in haptic systems. In this paper, for the first time, we investigate the frequency behavior of the passivity map for the upper limb when the muscle co-activation was controlled in real-time through visual electromyographic feedback. Five healthy subjects (age: 27 ± 5) were included in this study. The energetic behavior was evaluated at two stimulation frequencies at eight interaction directions over two controlled muscle co-activation levels. Electromyography (EMG) was captured using the Delsys Wireless Trigno system. Results showed a correlation between EMG and EoP, which was further altered by increasing the frequency. The proposed energetic behavior is named the Geometric MyoPassivity (GMP) map. The findings indicate that the GMP map has the potential to be used in real-time to quantify the absorbable energy, and thus the passivity margin of stability, for upper limb interaction during pHRI.
Abstract:Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a compact deep learning framework referred to as the CT-HGR, which employs a vision transformer network to conduct hand gesture recognition using high-density sEMG (HD-sEMG) signals. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computations and addresses the memory limitation problems while dealing with inputs of large sequence lengths. CT-HGR can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the CT-HGR framework can perform instantaneous recognition using an sEMG image spatially composed from HD-sEMG signals. A variant of the CT-HGR is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate the potential of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed CT-HGR framework is applied to window sizes of 31.25, 62.5, 125, and 250 ms of the above-mentioned dataset, utilizing 32, 64, and 128 electrode channels. The average accuracy over all the participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases to 91.98% for 128 electrodes and a window size of 250 ms. The CT-HGR achieves an accuracy of 89.13% for instantaneous recognition based on a single frame of HD-sEMG image.
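To make the window sizes in the abstract above concrete, the sketch below converts them to sample counts and cuts a multi-channel recording into overlapping windows. The 2048 Hz sampling rate is an assumption, chosen because it makes each listed window a power-of-two number of samples; the abstract does not state the rate. The step size and the function name `sliding_windows` are likewise hypothetical.

```python
import numpy as np

def sliding_windows(signal, window_len, step):
    """signal: (n_channels, n_samples) -> (n_windows, n_channels, window_len)."""
    n_samples = signal.shape[1]
    starts = range(0, n_samples - window_len + 1, step)
    return np.stack([signal[:, s : s + window_len] for s in starts])

fs = 2048                                          # assumed sampling rate (Hz)
window_ms = [31.25, 62.5, 125, 250]                # window sizes from the abstract
window_samples = [int(round(ms * fs / 1000)) for ms in window_ms]

# One second of placeholder data for 128 electrodes, shortest window, 50% overlap.
recording = np.zeros((128, fs))
windows = sliding_windows(recording, window_len=window_samples[0], step=32)
```

Under this assumed rate, the shortest window covers 64 samples per electrode, while the instantaneous setting in the abstract would correspond to a single time sample across the electrode grid.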