Queen Mary University of London
Abstract:This paper presents Tidal-MerzA, a novel system designed for collaborative performances between humans and a machine agent in the context of live coding, specifically focusing on the generation of musical patterns. Tidal-MerzA fuses two foundational models: ALCAA (Affective Live Coding Autonomous Agent) and Tidal Fuzz, a computational framework. By integrating affective modelling with computational generation, this system leverages reinforcement learning techniques to dynamically adapt music composition parameters within the TidalCycles framework, ensuring both affective qualities to the patterns and syntactical correctness. The development of Tidal-MerzA introduces two distinct agents: one focusing on the generation of mini-notation strings for musical expression, and another on the alignment of music with targeted affective states through reinforcement learning. This approach enhances the adaptability and creative potential of live coding practices and allows exploration of human-machine creative interactions. Tidal-MerzA advances the field of computational music generation, presenting a novel methodology for incorporating artificial intelligence into artistic practices.
Abstract:This paper introduces a novel approach to predicting periodic time series using reservoir computing. The model is tailored to deliver precise forecasts of rhythms, a crucial aspect for tasks such as generating musical rhythm. Leveraging reservoir computing, our proposed method is ultimately oriented towards predicting human perception of rhythm. Our network accurately predicts rhythmic signals within the human frequency perception range. The model architecture incorporates primary and intermediate neurons tasked with capturing and transmitting rhythmic information. Two parameter matrices, denoted as c and k, regulate the reservoir's overall dynamics. We propose a loss function to adapt c post-training and introduce a dynamic selection (DS) mechanism that adjusts $k$ to focus on areas with outstanding contributions. Experimental results on a diverse test set showcase accurate predictions, further improved through real-time tuning of the reservoir via c and k. Comparative assessments highlight its superior performance compared to conventional models.
Abstract:Formalizing creativity-related concepts has been a long-term goal of Computational Creativity. To the same end, we explore Formal Learning Theory in the context of creativity. We provide an introduction to the main concepts of this framework and a re-interpretation of terms commonly found in creativity discussions, proposing formal definitions for novelty and transformational creativity. This formalisation marks the beginning of a research branch we call Formal Creativity Theory, exploring how learning can be included as preparation for exploratory behaviour and how learning is a key part of transformational creative behaviour. By employing these definitions, we argue that, while novelty is neither necessary nor sufficient for transformational creativity in general, when using an inspiring set, rather than a sequence of experiences, an agent actually requires novelty for transformational creativity to occur.
Abstract:This paper presents a comprehensive study of automatic performer identification in expressive piano performances using convolutional neural networks (CNNs) and expressive features. Our work addresses the challenging multi-class classification task of identifying virtuoso pianists, which has substantial implications for building dynamic musical instruments with intelligence and smart musical systems. Incorporating recent advancements, we leveraged large-scale expressive piano performance datasets and deep learning techniques. We refined the scores by expanding repetitions and ornaments for more accurate feature extraction. We demonstrated the capability of one-dimensional CNNs for identifying pianists based on expressive features and analyzed the impact of the input sequence lengths and different features. The proposed model outperforms the baseline, achieving 85.3% accuracy in a 6-way identification task. Our refined dataset proved more apt for training a robust pianist identifier, making a substantial contribution to the field of automatic performer identification. Our codes have been released at https://github.com/BetsyTang/PID-CNN.
Abstract:Capturing intricate and subtle variations in human expressiveness in music performance using computational approaches is challenging. In this paper, we propose a novel approach for reconstructing human expressiveness in piano performance with a multi-layer bi-directional Transformer encoder. To address the needs for large amounts of accurately captured and score-aligned performance data in training neural networks, we use transcribed scores obtained from an existing transcription model to train our model. We integrate pianist identities to control the sampling process and explore the ability of our system to model variations in expressiveness for different pianists. The system is evaluated through statistical analysis of generated expressive performances and a listening test. Overall, the results suggest that our method achieves state-of-the-art in generating human-like piano performances from transcribed scores, while fully and consistently reconstructing human expressiveness poses further challenges.
Abstract:Human language, music and a variety of animal vocalisations constitute ways of sonic communication that exhibit remarkable structural complexity. While the complexities of language and possible parallels in animal communication have been discussed intensively, reflections on the complexity of music and animal song, and their comparisons are underrepresented. In some ways, music and animal songs are more comparable to each other than to language, as propositional semantics cannot be used as as indicator of communicative success or well-formedness, and notions of grammaticality are less easily defined. This review brings together accounts of the principles of structure building in language, music and animal song, relating them to the corresponding models in formal language theory, with a special focus on evaluating the benefits of using the Chomsky hierarchy (CH). We further discuss common misunderstandings and shortcomings concerning the CH, as well as extensions or augmentations of it that address some of these issues, and suggest ways to move beyond.
Abstract:This paper presents a geometric approach to the problem of modelling the relationship between words and concepts, focusing in particular on analogical phenomena in language and cognition. Grounded in recent theories regarding geometric conceptual spaces, we begin with an analysis of existing static distributional semantic models and move on to an exploration of a dynamic approach to using high dimensional spaces of word meaning to project subspaces where analogies can potentially be solved in an online, contextualised way. The crucial element of this analysis is the positioning of statistics in a geometric environment replete with opportunities for interpretation.