11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30. 2010

Dynamic Model Selection for Spectral Voice Conversion

Pierre Lanchantin, Xavier Rodet

IRCAM, France

Statistical methods for voice conversion are usually based on a single model selected in order to represent a tradeoff between goodness of fit and complexity. In this paper we assume that the best model may change over time, depending on the source acoustic features. We present a new method for spectral voice conversion called Dynamic Model Selection (DMS), in which a set of potential best models with increasing complexity - including a mixture of Gaussian and probabilistic principal component analyzers - are considered during the conversion of a source speech signal into a target speech signal. This set is built during the learning phase, according to the Bayes information criterion (BIC). During the conversion, the best model is dynamically selected among the models in the set, according to the acoustical features of each source frame. Subjective tests show that the method improves the conversion in terms of proximity to the target and quality.

Full Paper

Bibliographic reference.  Lanchantin, Pierre / Rodet, Xavier (2010): "Dynamic model selection for spectral voice conversion", In INTERSPEECH-2010, 1720-1723.