Abstract:Gait analysis using computer vision is an emerging field in AI, offering clinicians an objective, multi-feature approach to analyse complex movements. Despite its promise, current applications using RGB video data alone are limited in measuring clinically relevant spatial and temporal kinematics and establishing normative parameters essential for identifying movement abnormalities within a gait cycle. This paper presents a data-driven method using RGB video data and 2D human pose estimation for developing normative kinematic gait parameters. By analysing joint angles, an established kinematic measure in biomechanics and clinical practice, we aim to enhance gait analysis capabilities and improve explainability. Our cycle-wise kinematic analysis enables clinicians to simultaneously measure and compare multiple joint angles, assessing individuals against a normative population using just monocular RGB video. This approach expands clinical capacity, supports objective decision-making, and automates the identification of specific spatial and temporal deviations and abnormalities within the gait cycle.
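To make the joint-angle measure concrete, the minimal sketch below (not the authors' pipeline) computes a per-frame knee angle from 2D keypoints, assuming a COCO-style pose estimator has already produced (x, y) coordinates and that indices 11/13/15 correspond to the left hip, knee, and ankle.

    # Minimal sketch (not the authors' pipeline): per-frame knee angle from 2D
    # keypoints produced by a COCO-style pose estimator; indices are assumptions.
    import numpy as np

    def joint_angle(a, b, c):
        """Angle at point b (degrees) formed by segments b->a and b->c."""
        a, b, c = np.asarray(a, float), np.asarray(b, float), np.asarray(c, float)
        v1, v2 = a - b, c - b
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
        return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

    HIP, KNEE, ANKLE = 11, 13, 15          # assumed left-leg COCO keypoint indices

    def knee_angle_series(keypoints):
        """keypoints: array of shape (num_frames, num_joints, 2)."""
        return np.array([joint_angle(f[HIP], f[KNEE], f[ANKLE]) for f in keypoints])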
Abstract:Clinical gait analysis (CGA) using computer vision is an emerging field in artificial intelligence that faces barriers in the availability of accessible, real-world data and clearly defined task objectives. This paper lays the foundation for current developments in CGA as well as vision-based methods and datasets suitable for gait analysis. We introduce the Gait Abnormality in Video Dataset (GAVD) in response to our review of over 150 current gait-related computer vision datasets, which highlighted the need for a large, accessible gait dataset clinically annotated for CGA. GAVD stands out as the largest video gait dataset, comprising 1874 sequences of normal, abnormal and pathological gaits. Additionally, GAVD includes clinically annotated RGB data sourced from publicly available content on online platforms. It also encompasses over 400 subjects who have undergone clinical-grade visual screening to represent a diverse range of abnormal gait patterns, captured in various settings, including hospital clinics and uncontrolled urban outdoor environments. We demonstrate the validity of the dataset and the utility of action recognition models for CGA using the pretrained Temporal Segment Networks (TSN) and SlowFast network, achieving video abnormality detection accuracies of 94% and 92%, respectively, when tested on the GAVD dataset. A GitHub repository https://github.com/Rahmyyy/GAVD providing convenient URL links and clinically relevant annotations for CGA covers over 450 online videos featuring diverse subjects performing a range of normal, pathological, and abnormal gait patterns.
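As a hedged illustration of the abnormality-detection setup rather than the authors' exact TSN/SlowFast configuration, the sketch below fine-tunes a pretrained video backbone (torchvision's R3D-18 as a stand-in) for binary normal/abnormal gait classification; clip shapes and hyperparameters are placeholders.

    # Illustrative sketch only: binary normal/abnormal gait classification with a
    # pretrained video backbone. The paper uses TSN and SlowFast; an R3D-18 from
    # torchvision stands in here to show the general fine-tuning setup.
    import torch
    import torch.nn as nn
    from torchvision.models.video import r3d_18

    model = r3d_18(weights="DEFAULT")
    model.fc = nn.Linear(model.fc.in_features, 2)      # normal vs. abnormal gait

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    clips = torch.randn(4, 3, 16, 112, 112)            # (batch, C, T, H, W) placeholder
    labels = torch.randint(0, 2, (4,))
    loss = criterion(model(clips), labels)
    loss.backward()
    optimizer.step()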
Abstract:Camera-based remote photoplethysmography (rPPG) enables contactless measurement of important physiological signals such as pulse rate (PR). However, dynamic and unconstrained subject motion introduces significant variability into the facial appearance in video, confounding the ability of video-based methods to accurately extract the rPPG signal. In this study, we leverage the 3D facial surface to construct a novel orientation-conditioned facial texture video representation which improves the motion robustness of existing video-based facial rPPG estimation methods. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD over our baseline using the PhysNet model trained on PURE, highlighting the efficacy and generalization benefits of our designed video representation. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios in cross-dataset testing on MMPD, even in the presence of dynamic and unconstrained subject motion, emphasizing the benefits of disentangling motion through modeling the 3D facial surface for motion robust facial rPPG estimation. We validate the efficacy of our design decisions and the impact of different video processing steps through an ablation study. Our findings illustrate the potential strengths of exploiting the 3D facial surface as a general strategy for addressing dynamic and unconstrained subject motion in videos. The code is available at https://samcantrill.github.io/orientation-uv-rppg/.
Abstract:Seizure events may manifest as transient disruptions in movement and behavior, and the analysis of these clinical signs, referred to as semiology, is subject to observer variations when specialists evaluate video-recorded events in the clinical setting. To enhance the accuracy and consistency of evaluations, computer-aided video analysis of seizures has emerged as a natural avenue. In the field of medical applications, deep learning and computer vision approaches have driven substantial advancements. Historically, these approaches have been used for disease detection, classification, and prediction using diagnostic data; however, there has been limited exploration of their application in evaluating video-based motion detection in the clinical epileptology setting. While vision-based technologies do not aim to replace clinical expertise, they can significantly contribute to medical decision-making and patient care by providing quantitative evidence and decision support. Behavior monitoring tools offer several advantages such as providing objective information, detecting challenging-to-observe events, reducing documentation efforts, and extending assessment capabilities to areas with limited expertise. In this paper, we detail the foundation technologies used in vision-based systems in the analysis of seizure videos, highlighting their success in semiology detection and analysis, focusing on work published in the last 7 years. We systematically present these methods and indicate how the adoption of deep learning for the analysis of video recordings of seizures could be approached. Additionally, we illustrate how existing technologies can be interconnected through an integrated system for video-based semiology analysis. Finally, we discuss challenges and research directions for future studies.
Abstract:Wound management poses a significant challenge, particularly for bedridden patients and the elderly. Accurate diagnosis and healing monitoring can benefit significantly from modern image analysis, which provides accurate and precise measurements of wounds. Despite several existing techniques, the shortage of large and diverse training datasets remains a significant obstacle to constructing machine learning-based frameworks. This paper introduces Syn3DWound, an open-source dataset of high-fidelity simulated wounds with 2D and 3D annotations. We propose baseline methods and a benchmarking framework for automated 3D morphometry analysis and 2D/3D wound segmentation.
Abstract:Hyperspectral Imaging (HSI) provides detailed spectral information and has been utilised in many real-world applications. This work introduces an HSI dataset of building facades in a light industry environment with the aim of classifying different building materials in a scene. The dataset is called the Light Industrial Building HSI (LIB-HSI) dataset. This dataset consists of nine categories and 44 classes. In this study, we investigated deep learning based semantic segmentation algorithms on RGB and hyperspectral images to classify various building materials, such as timber, brick and concrete.
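As a rough sketch of how a standard segmentation network could be adapted to hyperspectral input (the backbone choice and the band count are assumptions, not the paper's configuration), one can widen the first convolution to accept all bands and set the output to the 44 LIB-HSI classes:

    # Hedged sketch: adapting an off-the-shelf segmentation model to hyperspectral
    # input by widening its first convolution. The band count is an assumption;
    # the 44-class output follows the LIB-HSI description.
    import torch
    import torch.nn as nn
    from torchvision.models.segmentation import deeplabv3_resnet50

    NUM_BANDS, NUM_CLASSES = 204, 44                   # band count is an assumption
    model = deeplabv3_resnet50(weights=None, weights_backbone=None,
                               num_classes=NUM_CLASSES)
    model.backbone.conv1 = nn.Conv2d(NUM_BANDS, 64, kernel_size=7,
                                     stride=2, padding=3, bias=False)

    cube = torch.randn(1, NUM_BANDS, 256, 256)         # (batch, bands, H, W)
    logits = model(cube)["out"]                        # (1, NUM_CLASSES, 256, 256)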
Abstract:Automatic labeling of anatomical structures, such as coronary arteries, is critical for diagnosis, yet existing (non-deep learning) methods are limited by a reliance on prior topological knowledge of the expected tree-like structures. As the structure of such vascular systems is often difficult to conceptualize, graph-based representations have become popular due to their ability to capture the geometric and topological properties of the morphology in an orientation-independent and abstract manner. However, graph-based learning for automated labeling of tree-like anatomical structures has received limited attention in the literature. The majority of prior studies have limitations in entity graph construction, depend on topological structures, and have limited accuracy due to anatomical variability between subjects. In this paper, we propose an intuitive graph representation method, well suited to 3D coordinate data obtained from angiography scans. We subsequently analyze subject-specific graphs using geometric deep learning. The proposed models leverage expert-annotated labels from 141 patients to learn representations of each coronary segment, while capturing the effects of anatomical variability within the training data. We investigate different variants of so-called message passing neural networks. Through extensive evaluations, our pipeline achieves a promising weighted F1-score of 0.805 for labeling coronary artery segments (13 classes) under five-fold cross-validation. Considering the ability of graph models to deal with irregular data, and their scalability for data segmentation, this work highlights the potential of such methods to provide quantitative evidence to support the decisions of medical experts.
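The sketch below shows one message-passing variant, a two-layer GCN in PyTorch Geometric, labeling centerline points of a coronary tree into 13 classes; using 3D coordinates as node features follows the abstract, while the specific architecture and dimensions are illustrative assumptions rather than the authors' model.

    # One message-passing variant (two-layer GCN) for per-node labeling of
    # coronary segments; architecture and sizes are illustrative assumptions.
    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import GCNConv
    from torch_geometric.data import Data

    class SegmentLabeler(torch.nn.Module):
        def __init__(self, in_dim=3, hidden=64, num_classes=13):
            super().__init__()
            self.conv1 = GCNConv(in_dim, hidden)
            self.conv2 = GCNConv(hidden, num_classes)

        def forward(self, data):
            x = F.relu(self.conv1(data.x, data.edge_index))
            return self.conv2(x, data.edge_index)      # per-node class logits

    # toy graph: 5 centerline points (x, y, z) connected as a chain
    x = torch.randn(5, 3)
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
    logits = SegmentLabeler()(Data(x=x, edge_index=edge_index))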
Abstract:Advances in machine learning and contactless sensors have enabled the understanding of complex human behaviors in a healthcare setting. In particular, several deep learning systems have been introduced to enable comprehensive analysis of neuro-developmental conditions such as Autism Spectrum Disorder (ASD). This condition affects children from their early developmental stages onwards, and diagnosis relies entirely on observing the child's behavior and detecting behavioral cues. However, the diagnosis process is time-consuming, as it requires long-term behavior observation, and is constrained by the scarce availability of specialists. We demonstrate the effectiveness of a region-based computer vision system to help clinicians and parents analyze a child's behavior. For this purpose, we adopt and enhance a dataset for analyzing autism-related actions using videos of children captured in uncontrolled environments (e.g. videos collected with consumer-grade cameras in varied environments). The data is pre-processed by detecting the target child in the video to reduce the impact of background noise. Motivated by the effectiveness of temporal convolutional models, we propose both light-weight and conventional models capable of extracting action features from video frames and classifying autism-related behaviors by analyzing the relationships between frames in a video. Through extensive evaluations of the feature extraction and learning strategies, we demonstrate that the best performance is achieved with an Inflated 3D ConvNet and Multi-Stage Temporal Convolutional Networks, achieving a 0.83 weighted F1-score for classification of the three autism-related actions, outperforming existing methods. We also propose a light-weight solution by employing the ESNet backbone within the same system, achieving a competitive 0.71 weighted F1-score and enabling potential deployment on embedded systems.
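As an illustrative sketch of the temporal-convolution idea (not the exact MS-TCN used in the paper), the snippet below classifies a clip from per-frame features such as I3D or ESNet embeddings using a small stack of dilated temporal convolutions; feature dimensions, layer sizes, and the 3-class head are assumptions.

    # Illustrative sketch: clip classification from per-frame features with a
    # small stack of dilated temporal convolutions; sizes are assumptions.
    import torch
    import torch.nn as nn

    class TemporalClassifier(nn.Module):
        def __init__(self, feat_dim=1024, hidden=64, num_classes=3):
            super().__init__()
            self.layers = nn.Sequential(
                nn.Conv1d(feat_dim, hidden, kernel_size=3, padding=1, dilation=1),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, padding=4, dilation=4),
                nn.ReLU(),
            )
            self.head = nn.Linear(hidden, num_classes)

        def forward(self, feats):                      # feats: (batch, T, feat_dim)
            h = self.layers(feats.transpose(1, 2))     # (batch, hidden, T)
            return self.head(h.mean(dim=2))            # pool over time -> logits

    logits = TemporalClassifier()(torch.randn(2, 150, 1024))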
Abstract:We present You Only Cut Once (YOCO) for performing data augmentations. YOCO cuts one image into two pieces and performs data augmentations individually within each piece. Applying YOCO improves the diversity of augmentation per sample and encourages neural networks to recognize objects from partial information. YOCO is parameter-free, easy to use, and boosts almost all augmentations at no extra cost. Thorough experiments are conducted to evaluate its effectiveness. We first demonstrate that YOCO can be seamlessly applied to a variety of data augmentations and neural network architectures, and brings performance gains on CIFAR and ImageNet classification tasks, sometimes surpassing conventional image-level augmentation by large margins. Moreover, we show that YOCO benefits contrastive pre-training, yielding a more powerful representation that transfers better to multiple downstream tasks. Finally, we study a number of variants of YOCO and empirically analyze the performance of the respective settings. Code is available at GitHub.
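A minimal sketch of the described procedure, assuming batched image tensors and any augmentation that operates on them: split each image into two halves along a randomly chosen axis, augment each half independently, and stitch the halves back together.

    # Minimal sketch of the described procedure (cut once, augment each half).
    import random
    import torch
    from torchvision import transforms

    def yoco(images, aug):
        """images: (batch, C, H, W) tensor; aug: augmentation applied per piece."""
        _, _, h, w = images.shape
        if random.random() < 0.5:                      # cut along width
            left, right = images[..., : w // 2], images[..., w // 2 :]
            return torch.cat([aug(left), aug(right)], dim=3)
        top, bottom = images[..., : h // 2, :], images[..., h // 2 :, :]   # cut along height
        return torch.cat([aug(top), aug(bottom)], dim=2)

    # example with a torchvision transform that accepts batched tensors
    flip = transforms.RandomHorizontalFlip(p=0.5)
    augmented = yoco(torch.rand(8, 3, 32, 32), flip)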
Abstract:Medical applications have benefited from the rapid advancement of computer vision. For patient monitoring in particular, in-bed human posture estimation provides important health-related metrics with potential value in medical condition assessments. Despite great progress in this domain, it remains a challenging task due to substantial ambiguity during occlusions and the lack of large corpora of manually labeled data for model training, particularly in domains such as thermal infrared imaging, which are privacy-preserving and thus of great interest. Motivated by the effectiveness of self-supervised methods in learning features directly from data, we propose a multi-modal conditional variational autoencoder (MC-VAE) capable of reconstructing features from modalities that are seen during training but missing at inference. This approach is used with HRNet to enable single-modality inference for in-bed pose estimation. Through extensive evaluations, we demonstrate that body positions can be effectively recognized from the available modality, achieving results on par with baseline models that depend on access to multiple modalities at inference time. The proposed framework supports future research towards self-supervised learning that generates a robust model from a single data source and is expected to generalize over many unknown distributions in clinical environments.
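A simplified sketch of the idea behind a multi-modal conditional VAE (not the exact MC-VAE architecture) is shown below: features of the available modality are encoded together with a conditioning vector and decoded into features of the missing modality; all dimensions are assumptions.

    # Simplified conditional-VAE sketch for reconstructing missing-modality
    # features from the available modality; dimensions are assumptions.
    import torch
    import torch.nn as nn

    class ConditionalVAE(nn.Module):
        def __init__(self, in_dim=256, cond_dim=256, latent=64, out_dim=256):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, 128), nn.ReLU())
            self.mu, self.logvar = nn.Linear(128, latent), nn.Linear(128, latent)
            self.dec = nn.Sequential(nn.Linear(latent + cond_dim, 128), nn.ReLU(),
                                     nn.Linear(128, out_dim))

        def forward(self, x, cond):
            h = self.enc(torch.cat([x, cond], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterize
            recon = self.dec(torch.cat([z, cond], dim=1))
            kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
            return recon, kl

    recon, kl = ConditionalVAE()(torch.randn(4, 256), torch.randn(4, 256))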