• 2023-2024 Season > Meeting of the Groupement de Recherche "Information Signal Image viSion": Signal Processing for Audio and Machine Listening - Music
  • Nov. 16, 2023
  • Ircam, Paris
Participants
  • Rachel Bittner (speaker)

"Basic-pitch" is a lightweight neural network for musical instrument transcription that supports polyphonic outputs and generalizes to a wide variety of instruments (including vocals). In this talk, we will discuss how we built and evaluated this efficient and simple model, which was shown experimentally to be substantially better than a comparable baseline at detecting notes. The model is trained to jointly predict frame-wise onsets, multi-pitch, and note activations, and we showed experimentally that this multi-output structure improves the resulting frame-level note accuracy. We will also listen to examples of using (and misusing) this model for creative purposes, through our open-source Python library or our demo website. Thanks to its scalability, the model can run in the browser, so your audio never leaves your own computer.
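
To make the multi-output idea concrete, here is a minimal sketch of how frame-wise onset and note activations could be combined into note events. It is an illustrative simplification, not the decoding used by the basic-pitch library: the function name `decode_notes`, the two thresholds, and the greedy rule (start a note at a strong onset, extend it while the note activation stays high) are all assumptions made for this example.

```python
from typing import List, Tuple

# (pitch_bin, start_frame, end_frame_exclusive); a hypothetical note format
Note = Tuple[int, int, int]


def decode_notes(
    onsets: List[List[float]],
    frames: List[List[float]],
    onset_thresh: float = 0.5,   # assumed value, chosen for illustration
    frame_thresh: float = 0.3,   # assumed value, chosen for illustration
) -> List[Note]:
    """Greedily turn [time][pitch] activation grids into note events.

    A note starts where the onset activation reaches onset_thresh and
    continues while the note activation stays at or above frame_thresh
    and no new onset is detected for that pitch.
    """
    n_frames = len(frames)
    n_pitches = len(frames[0]) if n_frames else 0
    notes: List[Note] = []
    for p in range(n_pitches):
        t = 0
        while t < n_frames:
            if onsets[t][p] >= onset_thresh:
                start = t
                t += 1
                # Extend the note until the activation drops or a new onset fires.
                while (
                    t < n_frames
                    and frames[t][p] >= frame_thresh
                    and onsets[t][p] < onset_thresh
                ):
                    t += 1
                notes.append((p, start, t))
            else:
                t += 1
    return notes


# Toy activations: 4 frames, 2 pitch bins (made-up numbers, not model output).
toy_onsets = [[0.9, 0.0], [0.1, 0.0], [0.0, 0.8], [0.0, 0.1]]
toy_frames = [[0.8, 0.0], [0.7, 0.1], [0.1, 0.9], [0.0, 0.6]]
toy_notes = decode_notes(toy_onsets, toy_frames)
```

The onset output fixes where a note begins, while the note activation determines how long it lasts. This is one intuition for why predicting both jointly could help note-level results.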

Paper: https://arxiv.org/abs/2203.09893
Code: https://github.com/spotify/basic-pitch
Demo: https://basicpitch.spotify.com
