Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Fabio Martinez

Spatio-temporal transformer to support automatic sign language translation

Feb 04, 2025

Christian Ruiz, Fabio Martinez

Abstract:Sign Language Translation (SLT) systems support hearing-impaired people communication by finding equivalences between signed and spoken languages. This task is however challenging due to multiple sign variations, complexity in language and inherent richness of expressions. Computational approaches have evidenced capabilities to support SLT. Nonetheless, these approaches remain limited to cover gestures variability and support long sequence translations. This paper introduces a Transformer-based architecture that encodes spatio-temporal motion gestures, preserving both local and long-range spatial information through the use of multiple convolutional and attention mechanisms. The proposed approach was validated on the Colombian Sign Language Translation Dataset (CoL-SLTD) outperforming baseline approaches, and achieving a BLEU4 of 46.84%. Additionally, the proposed approach was validated on the RWTH-PHOENIX-Weather-2014T (PHOENIX14T), achieving a BLEU4 score of 30.77%, demonstrating its robustness and effectiveness in handling real-world variations

Via

Access Paper or Ask Questions

COLON: The largest COlonoscopy LONg sequence public database

Mar 01, 2024

Lina Ruiz, Franklin Sierra-Jerez, Jair Ruiz, Fabio Martinez

Abstract:Colorectal cancer is the third most aggressive cancer worldwide. Polyps, as the main biomarker of the disease, are detected, localized, and characterized through colonoscopy procedures. Nonetheless, during the examination, up to 25% of polyps are missed, because of challenging conditions (camera movements, lighting changes), and the close similarity of polyps and intestinal folds. Besides, there is a remarked subjectivity and expert dependency to observe and detect abnormal regions along the intestinal tract. Currently, publicly available polyp datasets have allowed significant advances in computational strategies dedicated to characterizing non-parametric polyp shapes. These computational strategies have achieved remarkable scores of up to 90% in segmentation tasks. Nonetheless, these strategies operate on cropped and expert-selected frames that always observe polyps. In consequence, these computational approximations are far from clinical scenarios and real applications, where colonoscopies are redundant on intestinal background with high textural variability. In fact, the polyps typically represent less than 1% of total observations in a complete colonoscopy record. This work introduces COLON: the largest COlonoscopy LONg sequence dataset with around of 30 thousand polyp labeled frames and 400 thousand background frames. The dataset was collected from a total of 30 complete colonoscopies with polyps at different stages, variations in preparation procedures, and some cases the observation of surgical instrumentation. Additionally, 10 full intestinal background video control colonoscopies were integrated in order to achieve a robust polyp-background frame differentiation. The COLON dataset is open to the scientific community to bring new scenarios to propose computational tools dedicated to polyp detection and segmentation over long sequences, being closer to real colonoscopy scenarios.

* 7 pages, 3 figures, 3 tables

Via

Access Paper or Ask Questions

Parkinson gait modelling from an anomaly deep representation

Jan 26, 2023

Edgar Rangel, Fabio Martinez

Abstract:Parkinson's Disease is associated with gait movement disorders, such as postural instability, stiffness, and tremors. Today, some approaches implemented learning representations to quantify kinematic patterns during locomotion, supporting clinical procedures such as diagnosis and treatment planning. These approaches assumes a large amount of stratified and labeled data to optimize discriminative representations. Nonetheless, these considerations may restrict the operability of approaches in real scenarios during clinical practice. This work introduces a self-supervised generative representation, under the pretext of video reconstruction and anomaly detection framework. This architecture is trained following a one-class weakly supervised learning to avoid inter-class variance and approach the multiple relationships that represent locomotion. For validation 14 PD patients and 23 control subjects were recorded, and trained with the control population only, achieving an AUC of 86.9%, homoscedasticity level of 80% and shapeness level of 70% in the classification task considering its generalization.

* Submitted to Pattern Recognition Journal, 17 pages

Via

Access Paper or Ask Questions