Abstract: The automatic detection of gait anomalies can enable systems for fall detection and prevention. In this paper, we present a gait anomaly detection system based on the Matrix Profile (MP) algorithm. The MP algorithm is exact, parameter-free, simple, and efficient, making it well suited to edge deployment. The proposed system adapts to an individual's gait pattern and detects anomalous steps with short latency. To evaluate the system, we record a small database of enacted anomalous steps. The results show that the system outperforms a more complex neural network baseline.
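To make the Matrix Profile idea concrete, the following is a minimal brute-force sketch, not the system described in the abstract. It assumes z-normalised Euclidean distance, an exclusion zone of half the window length, and a synthetic sinusoidal "gait" signal with one distorted step; the function names and the signal are hypothetical and chosen only for illustration.

```python
import math

def znorm(window):
    """Z-normalise a window; a perfectly flat window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def matrix_profile(series, m):
    """Brute-force matrix profile: for every length-m subsequence, the
    z-normalised Euclidean distance to its nearest non-overlapping neighbour."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = m // 2  # assumed exclusion zone to skip trivial self-matches
    profile = []
    for i, a in enumerate(windows):
        best = math.inf
        for j, b in enumerate(windows):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, dist)
        profile.append(best)
    return profile

# Hypothetical signal: 12 periodic "steps" of 20 samples each.
step = 20
signal = [math.sin(2 * math.pi * t / step) for t in range(12 * step)]
# Distort the seventh step by doubling its frequency (the enacted anomaly).
for t in range(6 * step, 7 * step):
    signal[t] = math.sin(4 * math.pi * t / step)

profile = matrix_profile(signal, step)
# The discord (most anomalous subsequence) is the largest profile value.
idx = max(range(len(profile)), key=profile.__getitem__)
score = profile[idx]
print(f"discord starts at sample {idx}, distance {score:.2f}")
```

The double loop is quadratic in the number of windows, so it only serves to make the definition concrete. Regular steps find a near-identical neighbour elsewhere in the signal and score close to zero, while windows that overlap the distorted step do not.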
Abstract: Schizophrenia and bipolar disorder are debilitating psychiatric illnesses that can be challenging to diagnose accurately. The similarities between the two disorders make it difficult to differentiate between them using traditional diagnostic tools. Recently, resting-state functional magnetic resonance imaging (rsfMRI) has emerged as a promising tool for the diagnosis of psychiatric disorders. This paper presents several methods for differentiating schizophrenia and bipolar disorder based on features extracted from rsfMRI data. The best-performing system uses 1D Convolutional Neural Networks to analyze patterns in Intrinsic Connectivity time courses obtained from rsfMRI and potentially identify biomarkers that distinguish between the two disorders. We evaluate the system's performance on a large dataset of patients with schizophrenia and bipolar disorder and demonstrate that it achieves an Area Under the Curve (AUC) score of 0.7078 in differentiating patients with these disorders. Our results suggest that rsfMRI-based classification systems have great potential for improving the accuracy of psychiatric diagnoses and may ultimately lead to more effective treatments for patients with these disorders.
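As an aid to reading the reported AUC, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and scores are made-up toy values, and which disorder counts as the "positive" class is an assumption for illustration, not something the abstract specifies.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive is scored higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = one diagnosis, 0 = the other.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.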
Abstract: Speech technology is becoming ever more ubiquitous with the advance of speech-enabled devices and services. The use of speech synthesis in Augmentative and Alternative Communication tools has facilitated the inclusion of individuals with speech impediments, allowing them to communicate with their surroundings using speech. Although there are numerous speech synthesis systems for the world's most widely spoken languages, the offer for smaller languages is still limited. We propose and compare three models for Macedonian, built using parametric and deep learning techniques and trained on a newly recorded corpus. We target low-resource edge deployment for Augmentative and Alternative Communication and assistive technologies, such as communication boards and screen readers. The listening test results show that parametric speech synthesis performs on par with the more advanced deep learning models. Since it also requires fewer resources and offers full control over speech rate and pitch, it is the preferred choice for building a Macedonian TTS system for this application scenario.
Abstract: High-quality articulatory speech synthesis has many potential applications in speech science and technology. However, developing appropriate mappings from linguistic specification to articulatory gestures is difficult and time-consuming. In this paper, we construct an optimisation-based framework as a first step towards learning these mappings without manual intervention. We demonstrate the production of syllables with complex onsets and discuss the quality of the articulatory gestures with reference to coarticulation.
Abstract: Music transcription is the process of transcribing music audio into music notation. It is a field in which machines still cannot match human performance. The main motivation for automatic music transcription is to allow anyone who plays a musical instrument to generate the notation for a piece of music quickly and accurately. This applies equally to a beginner who struggles to find the music score by searching and to an expert who has heard a live jazz improvisation and would like to reproduce it without spending time on manual transcription. We propose Scorpiano, a system that automatically generates a music score for simple monophonic piano melody tracks using digital signal processing. The system integrates multiple digital audio processing methods: note onset detection, tempo estimation, beat detection, pitch detection, and finally generation of the music score. The system gives good results for simple piano melodies, comparable to those of commercially available neural-network-based systems.
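The pitch detection stage of such a pipeline can be illustrated with a small sketch. The abstract does not say which pitch detection method Scorpiano uses; the example below assumes plain autocorrelation over a single frame, a synthetic 440 Hz sine tone, and an 8 kHz sample rate, and all function names and parameter values are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def estimate_pitch(samples, sample_rate, fmin=100.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) from the lag that maximises
    the unnormalised autocorrelation within the [fmin, fmax] search range."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_val = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        val = sum(samples[i] * samples[i + lag]
                  for i in range(len(samples) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag

def frequency_to_note(freq):
    """Map a frequency to the nearest equal-tempered note name (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

# Synthetic test tone (assumption): 1024 samples of a 440 Hz sine at 8 kHz.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]

freq = estimate_pitch(samples, sample_rate)
note = frequency_to_note(freq)
print(f"estimated {freq:.1f} Hz -> {note}")
```

Because the lag is an integer number of samples, the frequency estimate is quantised; rounding to the nearest semitone absorbs that error for this tone. A real recording would additionally need onset detection to decide where each analysed frame starts.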
Abstract: The labelling of speech corpora is a laborious and time-consuming process. The ProsoBeast Annotation Tool seeks to ease and accelerate this process by providing an interactive 2D representation of the prosodic landscape of the data, in which contours are distributed based on their similarity. This interactive map allows the user to inspect and label the utterances. The tool integrates several state-of-the-art methods for dimensionality reduction and feature embedding, including variational autoencoders. The user can use these to find a good representation for their data. In addition, as most of these methods are stochastic, each can be used to generate an unlimited number of different prosodic maps. The web app then allows the user to seamlessly switch between these alternative representations during annotation. Experiments with a sample prosodically rich dataset have shown that the tool manages to find good representations of varied data and is helpful for both annotation and label correction. The tool is released as free software for use by the community.
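The idea of a 2D map in which similar contours land close together can be sketched with a simple linear projection. The example below uses PCA computed by power iteration as a stand-in; it is not one of the methods the abstract names, and the synthetic rising and falling contours, the noise level, and the function name are all assumptions made for illustration.

```python
import math
import random

def pca_2d(rows, iters=200, seed=0):
    """Project rows onto their top two principal components, found by
    power iteration on the (unscaled) covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    rng = random.Random(seed)
    components = []
    for _component in range(2):
        v = [rng.gauss(0, 1) for _k in range(d)]
        for _step in range(iters):
            # Remove any part of v that lies along components already found.
            for c in components:
                dot = sum(a * b for a, b in zip(v, c))
                v = [a - dot * b for a, b in zip(v, c)]
            # Multiply by X^T X in two passes: proj = X v, then v = X^T proj.
            proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
            v = [sum(proj[i] * centered[i][j] for i in range(n))
                 for j in range(d)]
            norm = math.sqrt(sum(a * a for a in v))
            v = [a / norm for a in v]
        components.append(v)
    return [[sum(a * b for a, b in zip(row, c)) for c in components]
            for row in centered]

# Hypothetical data: 10 rising and 10 falling contours, 10 points each.
noise = random.Random(1)
rising = [[t / 9 + noise.gauss(0, 0.05) for t in range(10)] for _r in range(10)]
falling = [[1 - t / 9 + noise.gauss(0, 0.05) for t in range(10)] for _r in range(10)]
coords = pca_2d(rising + falling)

for label, point in zip(["rise"] * 10 + ["fall"] * 10, coords):
    print(f"{label}: ({point[0]:+.2f}, {point[1]:+.2f})")
```

In this toy case the two contour families fall on opposite sides of the first axis, which is the kind of grouping an annotator could then label in bulk. Unlike the stochastic methods the abstract mentions, PCA yields essentially one map per dataset, so it would not on its own provide alternative representations to switch between.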