The thesis defense will be held before a jury composed of:
Prof. Josh McDermott - Lab. for Computational Audition, MIT - Rapporteur (via Skype)
Prof. Shlomo Dubnov - Dep. of Music, UCSD - Rapporteur (via Skype)
Prof. Laurent Daudet - Institut Langevin, Diderot University Paris - Examiner
Prof. Bruno Gas - ISIR, UPMC - Examiner
Prof. Alvin Su - SCREAM Lab, NCKU - Thesis director
Dr. Axel Roebel - IRCAM - Thesis director
In this thesis, we propose a new analysis-synthesis framework for environmental sounds and sound textures. It represents a sound texture parametrically through perceptually important statistics and provides an efficient mechanism for adapting those statistics in the time-frequency domain. The statistical description is based on the short-time Fourier transform (STFT). The statistics are adapted by exploiting the connection between the statistics of the time-frequency representation and the spectra of the time-frequency coefficients. When the order of the statistics is at most two, feasible signals can be generated directly from the statistical description without iteration. When the order is greater than two, the algorithm can still adapt all the statistics within a reasonable number of iterations.
The proposed framework makes it easy to extract the statistical description of a sound texture and then resynthesize arbitrarily long samples of the original texture from that description.
A perceptual evaluation has shown that the quality of the resynthesized sounds is at least as good as that of state-of-the-art methods, while the proposed approach is more efficient in terms of computation time.
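The non-iterative case for statistics of order at most two can be illustrated with a small sketch. It is not the algorithm of the thesis: it assumes a plain one-dimensional sequence instead of STFT coefficients, and the function names `circular_autocorrelation` and `random_phase_surrogate` are hypothetical. It relies only on the Wiener-Khinchin relation, under which the circular autocorrelation of a sequence (a second-order statistic) is the inverse Fourier transform of its power spectrum. Keeping the magnitude spectrum and drawing new random phases therefore yields, in a single step, a different signal with the same autocorrelation.

```python
import numpy as np


def circular_autocorrelation(x):
    """Circular autocorrelation computed from the power spectrum."""
    power = np.abs(np.fft.fft(x)) ** 2
    return np.fft.ifft(power).real / len(x)


def random_phase_surrogate(x, rng):
    """Return a real signal with the magnitude spectrum of x and random phases.

    Illustrative assumption: x is a real 1-D sequence, not an STFT frame.
    """
    n = len(x)
    magnitude = np.abs(np.fft.rfft(x))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=magnitude.shape)
    phase[0] = 0.0  # the DC bin of a real signal must stay real
    if n % 2 == 0:
        phase[-1] = 0.0  # so must the Nyquist bin when n is even
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A smoothed noise sequence stands in for a texture-like signal.
    x = np.convolve(rng.standard_normal(256), np.ones(8) / 8, mode="same")
    y = random_phase_surrogate(x, rng)
    diff = np.abs(circular_autocorrelation(x) - circular_autocorrelation(y))
    print("max autocorrelation difference:", diff.max())
```

Because only the phases change, no iteration is needed to match the second-order statistic. Statistics of higher order are not determined by the power spectrum alone, which is consistent with the abstract's statement that those require an iterative adaptation.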