Abstract:Automated evaluation of movement quality holds significant potential for enhancing physiotherapeutic treatments and sports training by providing objective, real-time feedback. However, the effectiveness of deep learning models in assessing movements captured by inertial measurement units (IMUs) is often hampered by limited data availability, class imbalance, and label ambiguity. In this work, we present a novel data augmentation method that generates realistic IMU data using musculoskeletal simulations integrated with systematic modifications of movement trajectories. Crucially, our approach ensures biomechanical plausibility and allows for automatic, reliable labeling by combining inverse kinematic parameters with a knowledge-based evaluation strategy. Extensive evaluations demonstrate that augmented variants closely resembles real-world data, significantly improving the classification accuracy and generalization capability of neural network models. Additionally, we highlight the benefits of augmented data for patient-specific fine-tuning scenarios, particularly when only limited subject-specific training examples are available. Our findings underline the practicality and efficacy of this augmentation method in overcoming common challenges faced by deep learning applications in physiotherapeutic exercise evaluation.
Abstract:Denoising diffusion probabilistic models are able to generate synthetic sensor signals. The training process of such a model is controlled by a loss function which measures the difference between the noise that was added in the forward process and the noise that was predicted by the diffusion model. This enables the generation of realistic data. However, the randomness within the process and the loss function itself makes it difficult to estimate the quality of the data. Therefore, we examine multiple similarity metrics and adapt an existing metric to overcome this issue by monitoring the training and synthetisation process using those metrics. The adapted metric can even be fine-tuned on the input data to comply with the requirements of an underlying classification task. We were able to significantly reduce the amount of training epochs without a performance reduction in the classification task. An optimized training process not only saves resources, but also reduces the time for training generative models.
Abstract:Kinematic sensors are often used to analyze movement behaviors in sports and daily activities due to their ease of use and lack of spatial restrictions, unlike video-based motion capturing systems. Still, the generation, and especially the labeling of motion data for specific activities can be time-consuming and costly. Additionally, many models struggle with limited data, which limits their performance in recognizing complex movement patterns. To address those issues, generating synthetic data can help expand the diversity and variability. In this work, we propose IMUDiffusion, a probabilistic diffusion model specifically designed for multivariate time series generation. Our approach enables the generation of high-quality time series sequences which accurately capture the dynamics of human activities. Moreover, by joining our dataset with synthetic data, we achieve a significant improvement in the performance of our baseline human activity classifier. In some cases, we are able to improve the macro F1-score by almost 30%. IMUDiffusion provides a valuable tool for generating realistic human activity movements and enhance the robustness of models in scenarios with limited training data.
Abstract:The extraction of text in high quality is essential for text-based document analysis tasks like Document Classification or Named Entity Recognition. Unfortunately, this is not always ensured, as poor scan quality and the resulting artifacts lead to errors in the Optical Character Recognition (OCR) process. Current approaches using Convolutional Neural Networks show promising results for background removal tasks but fail correcting artifacts like pixelation or compression errors. For general images, Transformer backbones are getting integrated more frequently in well-known neural network structures for denoising tasks. In this work, a modified UNet structure using a Swin Transformer backbone is presented to remove typical artifacts in scanned documents. Multi-headed cross-attention skip connections are used to more selectively learn features in respective levels of abstraction. The performance of this approach is examined regarding compression errors, pixelation and random noise. An improvement in text extraction quality with a reduced error rate of up to 53.9% on the synthetic data is archived. The pretrained base-model can be easily adapted to new artifacts. The cross-attention skip connections allow to integrate textual information extracted from the encoder or in form of commands to more selectively control the models outcome. The latter is shown by means of an example application.
Abstract:To improve the understanding of human gait and to facilitate novel developments in gait rehabilitation, the neural correlates of human gait as measured by means of non-invasive electroencephalography (EEG) have been investigated recently. Particularly, gait-related event-related brain potentials (gERPs) may provide information about the functional role of cortical brain regions in human gait control. The purpose of this paper is to explore possible experimental and technical solutions for time-sensitive analysis of human gait-related ERPs during spontaneous and instructed treadmill walking. A solution (HW/SW) for synchronous recording of gait- and EEG data was developed, tested and piloted. The solution consists of a custom-made USB synchronization interface, a time-synchronization module and a data merging module, allowing temporal synchronization of recording devices for time-sensitive extraction of gait markers for analysis of gait-related ERPs and for the training of artificial neural networks. In the present manuscript, the hardware and software components were tested with the following devices: A treadmill with an integrated pressure plate for gait analysis (zebris FDM-T) and an Acticap non-wireless 32-channel EEG-system (Brain Products GmbH). The usability and validity of the developed solution was tested in a pilot study (n = 3 healthy participants, n=3 females, mean age = 22.75 years). Recorded EEG data was segmented and analyzed according to the detected gait markers for the analysis of gait-related ERPs. Finally, EEG periods were used to train a deep learning artificial neural network as classifier of gait phases. The results obtained in this pilot study, although preliminary, support the feasibility of the solution for the application of gait-related EEG analysis..