November 16, 2023 53 min
November 16, 2023 51 min
November 16, 2023 05 min
November 16, 2023 01 h 04 min
November 7, 2024
"Basic-pitch" is a lightweight neural network for musical instrument transcription, which supports polyphonic outputs and generalizes to a wide variety of instruments (including vocals). In this talk, we will discuss how we built and evaluated this efficient and simple model, which experimentally showed to be substantially better than a comparable baseline in detecting notes. The model is trained to jointly predict frame-wise onsets, multi-pitch and note activations, and we experimentally showed that this multi-output structure improves the resulting frame-level note accuracy. We will also listen to examples using (and misusing) this model for creative purposes, using our open-source python library, or demo website: thanks to its scalability, the model can run on the browser, and your audio doesn't even leave your own computer.
Paper: https://arxiv.org/abs/2203.09893
Code: https://github.com/spotify/basic-pitch
Demo: https://basicpitch.spotify.com
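
To make the multi-output idea in the abstract more concrete, here is a minimal sketch of how frame-wise onset and note activations could be combined into discrete note events. It is an illustration only, not Basic Pitch's actual post-processing: the function name decode_notes, the thresholds, the minimum length, and the toy input matrices are all assumptions made up for this example. A note starts where the onset activation rises above a threshold, and lasts for as long as the note activation stays above a second threshold or until a new onset begins.

```python
from typing import List, Sequence, Tuple

# (pitch_bin, start_frame, end_frame_exclusive); hypothetical event format
Note = Tuple[int, int, int]


def decode_notes(
    onsets: Sequence[Sequence[float]],
    notes: Sequence[Sequence[float]],
    onset_thresh: float = 0.5,  # assumed value, chosen for illustration
    note_thresh: float = 0.3,   # assumed value, chosen for illustration
    min_len: int = 2,           # discard events shorter than this many frames
) -> List[Note]:
    """Turn onset/note activations indexed as [frame][pitch] into note events."""
    n_frames = len(notes)
    n_pitches = len(notes[0]) if n_frames else 0
    events: List[Note] = []

    for p in range(n_pitches):

        def is_onset(t: int) -> bool:
            # Rising edge: the onset activation crosses the threshold at frame t.
            above = onsets[t][p] >= onset_thresh
            return above and (t == 0 or onsets[t - 1][p] < onset_thresh)

        t = 0
        while t < n_frames:
            if not is_onset(t):
                t += 1
                continue
            end = t + 1
            # Sustain the note while it stays active and no new onset begins.
            while end < n_frames and notes[end][p] >= note_thresh and not is_onset(end):
                end += 1
            if end - t >= min_len:
                events.append((p, t, end))
            t = end

    # Sort by start frame, then pitch, for a stable, readable ordering.
    return sorted(events, key=lambda e: (e[1], e[0]))


if __name__ == "__main__":
    # Toy activations: 6 frames x 2 pitch bins.
    toy_onsets = [[0.9, 0.0], [0.8, 0.0], [0.1, 0.7], [0.0, 0.1], [0.9, 0.0], [0.0, 0.0]]
    toy_notes = [[0.8, 0.0], [0.9, 0.0], [0.7, 0.6], [0.6, 0.8], [0.9, 0.5], [0.4, 0.1]]
    for pitch, start, end in decode_notes(toy_onsets, toy_notes):
        print(f"pitch bin {pitch}: frames {start}..{end - 1}")
```

In this toy input, pitch bin 0 stays active throughout, but the second onset at frame 4 splits it into two notes. A note activation alone could not make that split, which gives some intuition for why predicting onsets alongside note activations can help.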