Abstract: An anthropomorphic grasping process significantly improves both the user experience and the grasping efficiency of prosthetic hand wearers. Currently, prosthetic hands controlled by signals such as brain-computer interfaces (BCI) and electromyography (EMG) have difficulty precisely recognizing amputees' grasping gestures and executing anthropomorphic grasping processes. Although prosthetic hands equipped with vision systems enable the recognition of object features, they lack perception of human grasping intention. Therefore, this paper explores the estimation of grasping gestures solely from visual data to achieve anthropomorphic grasping control and the determination of grasping intention in a multi-object environment. To this end, we propose the Spatial Geometry-based Gesture Mapping (SG-GM) method, which constructs gesture functions based on the geometric features of the human hand grasping process and is then implemented on the prosthetic hand. Furthermore, we propose the Motion Trajectory Regression-based Grasping Intent Estimation (MTR-GIE) algorithm, which predicts the object to be grasped using regression prediction and prior spatial segmentation estimation derived from the prosthetic hand's position and trajectory. Experiments were conducted on grasping 8 common daily objects, including a cup and a fork. The experimental results show a similarity coefficient $R^{2}$ of the grasping process of 0.911, a Root Mean Squared Error ($RMSE$) of 2.47\degree, a grasping success rate of 95.43$\%$, and an average grasping duration of 3.07$\pm$0.41 s. Grasping experiments were also conducted in a multi-object environment, where the average accuracy of intent estimation reached 94.35$\%$. Our methods offer a novel approach to enhancing prosthetic hand functionality and provide valuable insights for future research.
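The two similarity metrics reported above, $R^{2}$ and $RMSE$, can be illustrated with a minimal sketch. It assumes the comparison is made between a reference human joint-angle trajectory and the prosthetic hand's trajectory sampled at the same time steps; the abstract does not state the exact evaluation protocol, so the function names and the sample values below are hypothetical and serve only to show the standard definitions.

```python
import math


def rmse(reference, measured):
    """Root mean squared error between two equally sampled angle trajectories."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [(r - m) ** 2 for r, m in zip(reference, measured)]
    return math.sqrt(sum(squared_errors) / len(reference))


def r_squared(reference, measured):
    """Coefficient of determination of `measured` with respect to `reference`."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("trajectories must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - m) ** 2 for r, m in zip(reference, measured))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference trajectory is constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical joint angles in degrees (illustrative values, not data from the paper).
human_angles = [0.0, 10.0, 20.0, 30.0, 40.0]
prosthetic_angles = [1.0, 9.0, 22.0, 29.0, 41.0]

print(f"RMSE = {rmse(human_angles, prosthetic_angles):.3f} deg")
print(f"R^2  = {r_squared(human_angles, prosthetic_angles):.3f}")
```

A higher $R^{2}$ and a lower $RMSE$ both indicate that the prosthetic trajectory tracks the human reference more closely; reporting the two together separates overall shape agreement from absolute angular error.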
Abstract: Computational pathology (CPath) has significantly advanced the clinical practice of pathology. Despite this progress, Multiple Instance Learning (MIL), a promising paradigm within CPath, continues to face challenges, particularly incomplete utilization of information. Existing frameworks, such as those based on Convolutional Neural Networks (CNNs), attention, and selective-scan state space sequence models (SSMs), lack the flexibility and scalability needed to fuse diverse features effectively. Additionally, current approaches do not adequately exploit order-related and order-independent features, resulting in suboptimal utilization of sequence information. To address these limitations, we propose a novel MIL framework called Mamba2MIL. Our framework uses the state space duality model (SSD) to model long sequences of patches from whole slide images (WSIs); combined with weighted feature selection, it supports the fusion of features from additional branches and can be extended according to specific application needs. Moreover, we introduce a sequence transformation method tailored to varying WSI sizes, which enhances sequence-independent features while preserving local sequence information, thereby improving the utilization of sequence information. Extensive experiments across multiple datasets demonstrate that Mamba2MIL surpasses state-of-the-art MIL methods, with improvements in nearly all performance metrics. Specifically, on the NSCLC dataset, Mamba2MIL achieves a binary tumor classification AUC of 0.9533 and an accuracy of 0.8794. On the BRACS dataset, it achieves a multiclass classification AUC of 0.7986 and an accuracy of 0.4981. The code is available at https://github.com/YuqiZhang-Buaa/Mamba2MIL.