Abstract: Automatic piano transcription models are typically evaluated using simple frame- or note-wise information retrieval (IR) metrics. Such benchmark metrics do not provide insights into the transcription quality of specific musical aspects such as articulation, dynamics, or rhythmic precision of the output, which are essential in the context of expressive performance analysis. Furthermore, in recent years, MAESTRO has become the de facto training and evaluation dataset for such models. However, inference performance has been observed to deteriorate substantially when these models are applied to out-of-distribution data, calling into question the suitability and reliability of their transcribed outputs for specific MIR tasks. In this work, we investigate the performance of three state-of-the-art piano transcription models in two experiments. In the first, we propose a variety of musically informed evaluation metrics which, in contrast to the IR metrics, offer more detailed insight into the musical quality of the transcriptions. In the second experiment, we compare inference performance on real-world and perturbed audio recordings, and highlight musical dimensions which our metrics can help explain. Our experimental results highlight the weaknesses of existing piano transcription metrics and contribute to a more musically sound error analysis of transcription outputs.
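As a point of reference for the frame-wise IR metrics mentioned above, the following is a minimal illustrative sketch of how precision, recall, and F-measure can be computed from binary piano rolls. It is not the evaluation code used in the paper; the function name and roll layout are assumptions.

```python
import numpy as np

def framewise_prf(ref_roll: np.ndarray, est_roll: np.ndarray, eps: float = 1e-9):
    """Frame-wise precision/recall/F1 from binary piano rolls (pitch x frames)."""
    ref = ref_roll.astype(bool)
    est = est_roll.astype(bool)
    tp = np.logical_and(ref, est).sum()    # active frames present in both rolls
    fp = np.logical_and(~ref, est).sum()   # spurious frames in the estimate
    fn = np.logical_and(ref, ~est).sum()   # missed frames from the reference
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return precision, recall, f1
```

Such a score aggregates all pitches and frames into three numbers, which is precisely why it says little about articulation, dynamics, or rhythmic precision.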
Abstract: This paper introduces the ACCompanion, an expressive accompaniment system. Like a musician who accompanies a soloist playing a given musical piece, our system can produce a human-like rendition of the accompaniment part that follows the soloist's choices in terms of tempo, dynamics, and articulation. The ACCompanion works in the symbolic domain, i.e., it requires a musical instrument capable of producing and playing MIDI data, with explicitly encoded onset, offset, and pitch for each played note. We describe the components that go into such a system, from real-time score following and prediction to expressive performance generation and online adaptation to the expressive choices of the human player. Based on our experience with repeated live demonstrations in front of various audiences, we offer an analysis of the challenges of combining these components into a system that is highly reactive and precise, while still being a reliable musical partner, robust to possible performance errors and responsive to expressive variations.
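The symbolic input described above boils down to notes with explicit onset, offset, and pitch. A minimal sketch of such a note event follows; the class and field names are hypothetical and do not reflect the ACCompanion's internal representation.

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """A single played note in the symbolic (MIDI-like) domain."""
    pitch: int          # MIDI pitch number, 0-127
    onset: float        # onset time in seconds
    offset: float       # offset time in seconds
    velocity: int = 64  # MIDI velocity, a common proxy for dynamics

    @property
    def duration(self) -> float:
        return self.offset - self.onset
```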
Abstract: This paper presents the specifications of match: a file format that extends a MIDI human performance with note-, beat-, and downbeat-level alignments to a corresponding musical score. This enables advanced analyses of the performance that are relevant for various tasks, such as expressive performance modeling, score following, music transcription, and performer classification. The match file includes a set of score-related descriptors that also makes it usable as a bare-bones score representation. For applications that require the use of structural score elements (e.g., voices, parts, beams, slurs), the match file can be easily combined with the symbolic score. To support the practical application of our work, we release a corrected and upgraded version of the Vienna4x22 dataset of scores and performances aligned with match files.
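Conceptually, the note-level alignment described above links score note identifiers to performed MIDI notes. The sketch below illustrates the kind of information such an alignment conveys; it is not the actual match file syntax, and the identifiers are made up.

```python
# Hypothetical illustration of note-level alignment entries, NOT match syntax.
alignment = [
    # a score note matched to a performed note
    {"label": "match", "score_id": "n1", "performance_id": "p1"},
    # a score note with no corresponding performed note (e.g., skipped by the pianist)
    {"label": "deletion", "score_id": "n2"},
    # a performed note with no counterpart in the score (e.g., an extra note)
    {"label": "insertion", "performance_id": "p2"},
]
```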
Abstract: Partitura is a lightweight Python package for handling symbolic musical information. It provides easy access to features commonly used in music information retrieval tasks, like note arrays (lists of timed pitched events) and 2D piano roll matrices, as well as other score elements such as time and key signatures, performance directives, and repeat structures. Partitura can load musical scores (in MEI, MusicXML, Kern, and MIDI formats), MIDI performances, and score-to-performance alignments. The package includes some tools for music analysis, such as automatic pitch spelling, key signature identification, and voice separation. Partitura is an open-source project and is available at https://github.com/CPJKU/partitura/.
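A minimal usage sketch of the package is shown below, assuming a recent partitura release and a placeholder MusicXML file; exact function and field names may differ between versions.

```python
import partitura as pt

# Load a score (MusicXML here; MEI, Kern, and MIDI are also supported).
score = pt.load_score("score.musicxml")  # path is a placeholder

# Note array: a structured array of timed, pitched events.
note_array = score.note_array()
print(note_array["pitch"][:5])

# 2D piano roll matrix (pitches x time frames).
piano_roll = pt.utils.compute_pianoroll(note_array)
print(piano_roll.shape)
```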
Abstract: In this chapter, we focus on two main categories of visual interaction: body gestures and gaze direction. Our focus on body gestures is motivated by research showing that gesture patterns often change during joint action tasks to become more predictable (van der Wel et al., 2016). Moreover, coordination sometimes emerges between musicians at the level of body sway (Chang et al., 2017). Our focus on gaze direction is motivated by the fact that gaze can serve simultaneously as a means of obtaining information about the world and as a means of communicating one's own attention and intent.
Abstract: This demo paper introduces partitura, a Python package for handling symbolic musical information. The principal aim of this package is to handle richly structured musical information as conveyed by modern staff music notation. It provides a much wider range of possibilities for dealing with music than the more reductive (but very common) piano roll-oriented approach inspired by the MIDI standard. The package is an open-source project and is available on GitHub.
Abstract: In many musical traditions, the melody line is of primary significance in a piece. Human listeners can readily distinguish melodies from accompaniment; however, making this distinction given only the written score -- i.e., without listening to the music performed -- can be a difficult task. Solving this task is of great importance for both Music Information Retrieval and musicological applications. In this paper, we propose an automated approach to identifying the most salient melody line in a symbolic score. The backbone of the method is a convolutional neural network (CNN) estimating the probability that each note in the score (more precisely: each pixel in a piano roll encoding of the score) belongs to the melody line. We train and evaluate the method on various datasets, using manual annotations where available and solo instrument parts where not. We also propose a method to inspect the CNN and to analyze the influence exerted by notes on the prediction of other notes; this method can be applied whenever the output of a neural network has the same size as the input.
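For illustration, a fully convolutional network of the kind described above can keep the output the same size as the piano-roll input and emit a per-pixel melody probability. The sketch below is a toy model; the channel counts, kernel sizes, and depth are assumptions and not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class MelodyCNN(nn.Module):
    """Toy fully convolutional net: per-pixel melody probability on a piano roll."""
    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, pianoroll: torch.Tensor) -> torch.Tensor:
        # pianoroll: (batch, 1, pitches, frames); the output has the same shape,
        # each pixel holding the probability of belonging to the melody line.
        return torch.sigmoid(self.net(pianoroll))

# Example: a random 88-pitch, 200-frame piano roll.
probs = MelodyCNN()(torch.rand(1, 1, 88, 200))
print(probs.shape)  # torch.Size([1, 1, 88, 200])
```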
Abstract: Musicians produce individualized, expressive performances by manipulating parameters such as dynamics, tempo, and articulation. This manipulation of expressive parameters is informed by elements of score information such as pitch, meter, and tempo and dynamics markings (among others). In this paper we present an interactive interface that gives users the opportunity to explore the relationship between structural elements of a score and expressive parameters. This interface draws on basis function models, a data-driven framework for expressive performance. In this framework, expressive parameters are modeled as a function of score features, i.e., numerical encodings of specific aspects of a musical score, using neural networks. With the proposed interface, users are able to weight the contribution of individual score features and understand how an expressive performance is constructed.
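As a rough illustration of the idea of weighting score features, the sketch below uses a deliberately simplified linear combination of basis functions (the actual models use neural networks); all feature names and values are made up.

```python
import numpy as np

# Each column is one score feature ("basis function") evaluated per note, e.g.
# pitch height, metrical strength, and the presence of a dynamics marking.
basis = np.array([
    [0.4, 1.0, 0.0],
    [0.6, 0.5, 1.0],
    [0.5, 1.0, 1.0],
])  # shape: (notes, features)

# User-adjustable weights, one per score feature (the sliders in the interface).
weights = np.array([0.2, 0.5, 0.8])

# Simplified linear version of the model: the predicted expressive parameter
# (e.g., loudness per note) is a weighted combination of the basis functions.
predicted_dynamics = basis @ weights
print(predicted_dynamics)
```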
Abstract: In this paper we present preliminary work examining the relationship between the formation of expectations and the realization of musical performances, paying particular attention to expressive tempo and dynamics. To compute features that reflect what a listener is expecting to hear, we employ a computational model of auditory expectation called the Information Dynamics of Music model (IDyOM). We then explore how well these expectancy features -- when combined with score descriptors using the Basis-Function modeling approach -- can predict expressive tempo and dynamics in a dataset of Mozart piano sonata performances. Our results suggest that using expectancy features significantly improves the predictions for tempo.
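The sketch below illustrates the general idea of concatenating expectancy features with score descriptors before fitting a predictive model; it uses synthetic data and plain ridge regression for illustration, not the models or features of the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_notes = 500

score_features = rng.normal(size=(n_notes, 8))       # e.g., pitch, metrical position, ...
expectancy_features = rng.normal(size=(n_notes, 2))  # e.g., information content, entropy

# Concatenate score descriptors with expectancy features before regression.
X = np.hstack([score_features, expectancy_features])
tempo = X @ rng.normal(size=X.shape[1]) + 0.1 * rng.normal(size=n_notes)  # synthetic target

scores = cross_val_score(Ridge(alpha=1.0), X, tempo, cv=5, scoring="r2")
print(scores.mean())
```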
Abstract: Tonal structure is in part conveyed by statistical regularities between musical events, and research has shown that computational models reflect tonal structure in music by capturing these regularities in schematic constructs like pitch histograms. Of the few studies that model the acquisition of tonal knowledge through perceptual learning from musical data, most have employed self-organizing models that learn a topology of static descriptions of musical contexts. Also, the stimuli used to train these models are often symbolic rather than acoustically faithful representations of musical material. In this work we investigate whether sequential predictive models of musical memory (specifically, recurrent neural networks), trained on audio from commercial CD recordings, induce tonal knowledge in a similar manner to listeners (as shown in behavioral studies in music perception). Our experiments indicate that various types of recurrent neural networks produce musical expectations that clearly convey tonal structure. Furthermore, the results imply that although implicit knowledge of tonal structure is a necessary condition for accurate musical expectation, the most accurate predictive models also use other cues beyond the tonal structure of the musical context.
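For illustration, a sequential predictive model of the kind described above can be sketched as a recurrent network trained to predict the next audio-derived frame from the sequence so far. The representation (chroma vectors) and architecture below are assumptions, not those used in the study.

```python
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    """Toy recurrent model: predict the next input frame from the sequence so far."""
    def __init__(self, n_features: int = 12, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, features), e.g. chroma vectors derived from audio
        hidden_states, _ = self.rnn(frames)
        return self.out(hidden_states)  # prediction for the next frame at each step

model = NextFramePredictor()
frames = torch.rand(1, 100, 12)        # one sequence of 100 chroma frames
predictions = model(frames[:, :-1])    # predict frames 2..100 from frames 1..99
loss = nn.functional.mse_loss(predictions, frames[:, 1:])
print(loss.item())
```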