Abstract:Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals. Recently, therapeutic interventions for ASD have increasingly utilized Deep learning powered-computer vision techniques to monitor individual progress over time. These models are trained on private, non-public datasets from the autism community, creating challenges in comparing results across different models due to privacy-preserving data-sharing issues. This work introduces MMASD+. MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It integrates the capabilities of Yolov8 and Deep SORT algorithms to distinguish between the therapist and children, addressing a significant barrier in the original dataset. Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD. This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities. These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.
Abstract:Autism spectrum disorder (ASD) is a developmental disorder characterized by significant social communication impairments and difficulties perceiving and presenting communication cues. Machine learning techniques have been broadly adopted to facilitate autism studies and assessments. However, computational models are primarily concentrated on specific analysis and validated on private datasets in the autism community, which limits comparisons across models due to privacy-preserving data sharing complications. This work presents a novel privacy-preserving open-source dataset, MMASD as a MultiModal ASD benchmark dataset, collected from play therapy interventions of children with Autism. MMASD includes data from 32 children with ASD, and 1,315 data samples segmented from over 100 hours of intervention recordings. To promote public access, each data sample consists of four privacy-preserving modalities of data: (1) optical flow, (2) 2D skeleton, (3) 3D skeleton, and (4) clinician ASD evaluation scores of children, e.g., ADOS scores. MMASD aims to assist researchers and therapists in understanding children's cognitive status, monitoring their progress during therapy, and customizing the treatment plan accordingly. It also has inspiration for downstream tasks such as action quality assessment and interpersonal synchrony estimation. MMASD dataset can be easily accessed at https://github.com/Li-Jicheng/MMASD-A-Multimodal-Dataset-for-Autism-Intervention-Analysis.
Abstract:Stroke patients often experience upper limb impairments that restrict their mobility and daily activities. Physical therapy (PT) is the most effective method to improve impairments, but low patient adherence and participation in PT exercises pose significant challenges. To overcome these barriers, a combination of virtual reality (VR) and robotics in PT is promising. However, few systems effectively integrate VR with robotics, especially for upper limb rehabilitation. Additionally, traditional VR rehabilitation primarily focuses on hand movements rather than joint movements of the limb. This work introduces a new virtual rehabilitation solution that combines VR with KinArm robotics and a wearable elbow sensor to measure elbow joint movements. The framework also enhances the capabilities of a traditional robotic device (KinArm) used for motor dysfunction assessment and rehabilitation. A preliminary study with non-clinical participants (n = 16) was conducted to evaluate the effectiveness and usability of the proposed VR framework. We used a two-way repeated measures experimental design where participants performed two tasks (Circle and Diamond) with two conditions (VR and VR KinArm). We found no main effect of the conditions for task completion time. However, there were significant differences in both the normalized number of mistakes and recorded elbow joint angles (captured as resistance change values from the wearable sensor) between the Circle and Diamond tasks. Additionally, we report the system usability, task load, and presence in the proposed VR framework. This system demonstrates the potential advantages of an immersive, multi-sensory approach and provides future avenues for research in developing more cost-effective, tailored, and personalized upper limb solutions for home therapy applications.