Information

Type
Thesis/HDR defence
Venue
Ircam, Salle Igor-Stravinsky (Paris)
Duration
44 min
Date
18 October 2023

Thesis defence of Daniel Bedoya

Daniel Bedoya, a doctoral student at Sorbonne University and beneficiary of an EDITE contract, completed his thesis, "Capturing musical prosody through interactive audiovisual annotations," within the Musical Representations team of the STMS Laboratory (Ircam-Sorbonne University-CNRS-Ministère de la Culture). His thesis was funded by the ERC project COSMOS (Computational Shaping and Modeling of Musical Structures) at the STMS Laboratory and by an ATER contract at the CNAM, LMSSC Laboratory. The defence takes place at Ircam, in English, before a jury comprising:

- Roberto Bresin, KTH Royal Institute of Technology (Sweden), Rapporteur
- Pierre Couprie, University of Paris-Saclay, Rapporteur
- Jean-Julien Aucouturier, FEMTO-ST, Examiner
- Louis Bigo, Université de Lille, Examiner
- Muki Haklay, University College London (United Kingdom), Examiner
- Anja Volk, Utrecht University (Netherlands), Examiner
- Carlos Agón, Sorbonne University, Director
- Elaine Chew, King's College London (United Kingdom), Co-Director

Abstract:

Participatory science (PS) projects have stimulated research across several disciplines in recent years. Citizen scientists contribute to this research by performing cognitive tasks, fostering learning, innovation and inclusion. Although crowdsourcing has been used to collect structural annotations in music, PS remains underused for studying musical expressivity. We introduce a new annotation protocol to capture musical prosody, the acoustic variations performers introduce to make music expressive. Our top-down, human-centred method puts the listener first in producing annotations of the prosodic features of music. We focus on segmentation and prominence, which convey structure and affect. This protocol provides a PS framework and an experimental approach for conducting systematic, scalable studies.

We implement this annotation protocol in CosmoNote, customisable web-based software designed to facilitate the annotation of expressive musical structures. CosmoNote lets users interact with visual layers, including the waveform, recorded notes, extracted audio attributes and score features, and place boundaries of varying strengths, regions, comments and note groups.

We conducted two studies to improve the protocol and the platform. The first examines the impact of simultaneous auditory and visual stimuli on segmentation boundaries. We compare boundary distributions derived from intermodal (auditory and visual) and unimodal (auditory or visual) information. The distances between the unimodal-visual and intermodal distributions are smaller than those between the unimodal-auditory and intermodal distributions. We show that adding visuals accentuates key information and provides cognitive scaffolding that helps annotators mark prosodic boundaries clearly, although visuals may also draw attention away from specific structures. Conversely, without audio the annotation task becomes difficult and subtle cues are masked. Even when exaggerated or inaccurate, visual cues are essential for guiding boundary annotations of a performance, improving overall results.

The second study uses all of CosmoNote's annotation types and analyses how participants annotate musical prosody, with minimal or detailed instructions, in a free annotation setting. We compare annotation quality between musicians and non-musicians.
We evaluate the PS component in an ecological setting where participants are fully autonomous, in a task where time, attention and patience are valued. We present three methods, based on annotation labels, categories and shared properties, for analysing and aggregating the data. The results show convergence in the types of annotations and the descriptions used to mark recurring musical elements, across experimental conditions and musical abilities. We propose strategies for improving the protocol, data aggregation and analysis in large-scale applications.

This thesis enriches the representation and understanding of structures in performed music by introducing an annotation protocol and platform, adaptable experiments, and aggregation and analysis methods. We demonstrate the importance of the trade-off between obtaining data that is simpler to analyse and richer content that captures complex musical thinking. Our protocol can be generalised to studies of performance decisions, improving understanding of expressive choices in musical performance.
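
The abstract does not say which distance measure the first study uses to compare boundary distributions. As a minimal illustrative sketch only, assuming boundary times pooled per listening condition and an earth mover's (Wasserstein) distance, the comparison could look like this in Python (all condition names and numbers below are invented):

import numpy as np
from scipy.stats import wasserstein_distance

# Boundary annotation times (in seconds) pooled across annotators,
# one array per listening condition (illustrative values only).
boundaries = {
    "intermodal": np.array([12.1, 12.4, 33.0, 33.2, 57.8]),
    "unimodal_audio": np.array([11.0, 14.2, 31.5, 58.9]),
    "unimodal_visual": np.array([12.0, 12.6, 32.8, 57.5]),
}

# Earth mover's distance between each unimodal distribution and the
# intermodal reference; a smaller value means closer agreement.
for cond in ("unimodal_audio", "unimodal_visual"):
    d = wasserstein_distance(boundaries["intermodal"], boundaries[cond])
    print(f"{cond} vs intermodal: {d:.2f} s")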
