Abstract: The ethics of AI as both material and medium for interaction remains in murky waters within the context of musical and artistic practice. The interdisciplinarity of the field is revealing matters of concern and care, which necessitate interdisciplinary methodologies for evaluation to trouble and critique the inheritance of "residue-laden" AI tools in musical applications. Seeking to unsettle these murky waters, this paper critically examines the example of Holly+, a deep neural network that generates raw audio in the likeness of its creator, Holly Herndon. Drawing from theoretical concerns and considerations from speculative feminism and care ethics, we care-fully trouble the structures, frameworks and assumptions that oscillate within and around Holly+. From this critical feminist examination, we contribute several considerations and contemplate future directions for integrating speculative feminism and care into musical-AI agent and system design.
Abstract: The explosion of content generated by Artificial Intelligence models has initiated a cultural shift in arts, music, and media, where roles are changing, values are shifting, and conventions are being challenged. The readily available, vast dataset of the internet has created an environment in which AI models can be trained on any content on the web. With AI models shared openly and used by many people globally, how does this new paradigm challenge the status quo in artistic practices? What kinds of changes will AI technology bring to music, arts, and new media?
Abstract: Research on Deep Learning applications in sound and music computing has gathered interest in recent years; however, there is still a missing link between these new technologies and how they can be incorporated into real-world artistic practices. In this work, we explore a well-known Deep Learning architecture called the Variational Autoencoder (VAE). VAEs have been used in many areas to generate latent spaces in which similar data points are located closer to each other. Previously, VAEs have been used to generate latent timbre spaces or latent spaces of symbolic music excerpts. Applying a VAE to audio features of timbre requires a vocoder to transform the timbre generated by the network into an audio signal, which is computationally expensive. In this work, we apply VAEs to raw audio data directly, bypassing audio feature extraction. This approach allows practitioners to use any audio recording while giving flexibility and control over the aesthetics through dataset curation. The lower computation time in audio signal generation allows the raw audio approach to be incorporated into real-time applications. We propose three strategies to explore latent spaces of audio and timbre for sound design applications. By doing so, we aim to initiate a conversation on artistic approaches and strategies for utilizing latent audio spaces in sound and music practices.
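To make the idea of a latent space over raw audio more concrete, the following is a minimal sketch, not the authors' architecture. It assumes a toy VAE with single linear layers and random, untrained weights; the frame length `FRAME`, the latent size `LATENT`, and every function name are hypothetical choices made for illustration. It shows a raw-audio frame being encoded to a latent point, the reparameterization step, direct decoding back to samples without a vocoder, and a linear walk between two latent points.

```python
import math
import random

FRAME = 64   # hypothetical raw-audio window length, in samples
LATENT = 4   # hypothetical latent dimensionality

rng = random.Random(0)

def linear(n_out, n_in):
    """Random weight matrix standing in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def matvec(w, x):
    """Plain matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

W_MU = linear(LATENT, FRAME)
W_LOGVAR = linear(LATENT, FRAME)
W_DEC = linear(FRAME, LATENT)

def encode(frame):
    """Map a raw-audio frame to the mean and log-variance of q(z | x)."""
    return matvec(W_MU, frame), matvec(W_LOGVAR, frame)

def sample(mu, logvar):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def decode(z):
    """Map a latent vector straight back to audio samples in [-1, 1]."""
    return [math.tanh(v) for v in matvec(W_DEC, z)]

def interpolate(z_a, z_b, t):
    """Linear blend between two latent points, for 0 <= t <= 1."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

# Two toy input frames: a sine wave and the square wave derived from it.
sine = [math.sin(2 * math.pi * 4 * n / FRAME) for n in range(FRAME)]
square = [1.0 if s >= 0 else -1.0 for s in sine]

# Use the posterior means as deterministic latent coordinates.
z_sine, _ = encode(sine)
z_square, _ = encode(square)

# Five decoded frames along the straight line between the two latent points.
morph = [decode(interpolate(z_sine, z_square, t / 4)) for t in range(5)]
```

With trained weights in place of the random ones, the frames in `morph` would correspond to a gradual transition between the two input sounds. Because the weights here are untrained, the output only demonstrates the data flow, not a meaningful timbre.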