information

Type
Thesis/HDR defense
Venue
Ircam, Salle Igor-Stravinsky (Paris)
duration
47 min
date
February 21, 2023

Antoine Caillon's thesis defense

Antoine Caillon, doctoral candidate at Sorbonne Université, defends his thesis "Hierarchical temporal learning for neural audio synthesis of music" ("Apprentissage temporel hiérarchique pour la synthèse audio neuronale de la musique"), carried out in the Musical Representations team of the Ircam STMS laboratory under the supervision of Jean Bresson and Philippe Esling.

The jury:
Simon Colton, reviewer - Queen Mary University of London (United Kingdom)
Bob Sturm, reviewer - Royal Institute of Technology (Sweden)
Michèle Sebag, examiner - Université Paris-Saclay
Patrick Gallinari, examiner - Sorbonne Université
Mark Sandler, examiner - Queen Mary University of London (United Kingdom)
Jean Bresson, thesis supervisor - Sorbonne Université
Philippe Esling, thesis co-supervisor and advisor - Sorbonne Université

Abstract

Recent advances in deep learning offer new ways to build models for a wide variety of tasks by optimizing a set of parameters to minimize a cost function. Among these techniques, probabilistic generative models have yielded impressive advances in text, image, and sound generation. Generating musical audio signals, however, remains a challenging problem. In this thesis, we study how a hierarchical approach to audio modeling can address the musical signal modeling task while offering the user several levels of control. Our main hypothesis is that extracting several representation levels from an audio signal lets each modeling stage abstract away the complexity of the levels below it, which in turn allows the use of lightweight architectures, each modeling a single audio scale. We first address raw audio modeling with a model combining Variational Autoencoders and Generative Adversarial Networks, yielding high-quality 48 kHz neural audio synthesis that runs 20 times faster than real time on CPU. We then study how autoregressive models can capture the temporal behavior of the representation produced by this low-level audio model, optionally conditioned on additional signals such as acoustic descriptors or tempo. Finally, we propose a method for applying all of these models directly to audio streams, enabling their use in the real-time applications developed during this thesis.
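To make the hierarchy concrete, below is a minimal sketch (in PyTorch, assumed here) of the two-stage design the abstract describes: a convolutional variational autoencoder compresses raw audio into a low-rate latent sequence, and a small autoregressive prior models that sequence over time. All module names, channel counts, and strides are illustrative choices, not the architecture from the thesis; the adversarial critic that sharpens the decoder output is omitted.

import torch
import torch.nn as nn

class LatentEncoder(nn.Module):
    # Strided 1-D convolutions: waveform -> (mean, log-variance) latent
    # sequence running 64x slower than the audio rate (4 x 4 x 4 downsampling).
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 64, 15, stride=4, padding=7), nn.ReLU(),
            nn.Conv1d(64, 128, 15, stride=4, padding=7), nn.ReLU(),
            nn.Conv1d(128, 2 * latent_dim, 15, stride=4, padding=7),
        )

    def forward(self, audio):
        mean, log_var = self.net(audio).chunk(2, dim=1)
        z = mean + torch.randn_like(mean) * (0.5 * log_var).exp()  # reparameterization
        return z, mean, log_var

class LatentDecoder(nn.Module):
    # Transposed convolutions mirror the encoder: latent sequence -> waveform.
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose1d(latent_dim, 128, 16, stride=4, padding=6), nn.ReLU(),
            nn.ConvTranspose1d(128, 64, 16, stride=4, padding=6), nn.ReLU(),
            nn.ConvTranspose1d(64, 1, 16, stride=4, padding=6), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

class LatentPrior(nn.Module):
    # Autoregressive GRU predicting the next latent frame from past frames;
    # conditioning signals (descriptors, tempo) could be concatenated to its input.
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        self.rnn = nn.GRU(latent_dim, 256, batch_first=True)
        self.proj = nn.Linear(256, latent_dim)

    def forward(self, z):
        h, _ = self.rnn(z.transpose(1, 2))   # (batch, time, hidden)
        return self.proj(h).transpose(1, 2)  # back to (batch, latent_dim, time)

audio = torch.randn(1, 1, 2 ** 14)                # 16384 samples of raw audio
encoder, decoder, prior = LatentEncoder(), LatentDecoder(), LatentPrior()
z, mean, log_var = encoder(audio)                 # 256 latent frames
reconstruction = decoder(z)                       # back to 16384 samples
z_next = prior(z)                                 # next-frame latent prediction

The point of the split is exactly the hypothesis stated above: the prior never touches raw samples, only a sequence 64 times slower, so each stage can remain lightweight.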
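The final point, running these models on live audio streams, typically hinges on making every convolution causal and stateful. Below is a minimal sketch of a "cached" convolution, a standard streaming trick (assumed here for illustration, not necessarily the exact mechanism used in the thesis): left zero padding is replaced by a persistent buffer holding the tail of the previous block, so block-by-block processing matches an offline pass with causal padding.

import torch
import torch.nn as nn

class CachedConv1d(nn.Module):
    # A causal 1-D convolution whose left context survives across calls,
    # enabling block-by-block (streaming) inference.
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size)
        # Buffer of past samples standing in for causal zero padding.
        self.register_buffer("cache", torch.zeros(1, in_channels, kernel_size - 1))

    def forward(self, block):
        padded = torch.cat([self.cache, block], dim=-1)  # prepend past context
        self.cache = padded[..., -(self.conv.kernel_size[0] - 1):].detach()
        return self.conv(padded)                         # same length as the input block

layer = CachedConv1d(1, 8, kernel_size=5)
stream = torch.randn(1, 1, 4096)
# Feeding 512-sample blocks yields the same output as one offline pass
# over the full signal with causal zero padding on the left.
streamed = torch.cat([layer(b) for b in stream.split(512, dim=-1)], dim=-1)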

