10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Alleviating the One-to-Many Mapping Problem in Voice Conversion with Context-Dependent Modeling

Elizabeth Godoy (1), Olivier Rosec (1), Thierry Chonavel (2)

(1) Orange Labs, France
(2) Telecom Bretagne, France

This paper addresses the “one-to-many” mapping problem in Voice Conversion (VC) by exploring source-to-target mappings in GMM-based spectral transformation. Specifically, we examine the differences between using source-only and joint source/target information in the classification stage of transformation, which illustrates a “one-to-many effect” in the traditional acoustically based GMM. We propose combating this effect by using phonetic information in GMM learning and classification. We then demonstrate the success of the proposed context-dependent modeling with transformation results evaluated using an objective error criterion. Finally, we discuss the implications of our work for adapting current approaches to VC.
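The contrast between source-only and joint source/target classification can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a two-component joint GMM over one-dimensional source and target features, and all weights, means, covariances, and function names are hypothetical values chosen for illustration. Both components share the same source mean but have different target means, so a source frame alone cannot decide between them.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    d = mean.shape[0]
    diff = x - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def posteriors(x, weights, means, covs):
    """Posterior probability of each GMM component for each row of x."""
    lik = np.stack(
        [w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    return lik / lik.sum(axis=1, keepdims=True)

# Hypothetical joint GMM over z = [x, y]: 1-D source x, 1-D target y.
# The two components overlap completely in x but are well separated in y.
weights = np.array([0.5, 0.5])
joint_means = np.array([[0.0, -2.0], [0.0, 2.0]])
joint_covs = np.array([np.eye(2) * 0.25, np.eye(2) * 0.25])

def classify_source_only(x):
    """Classify using only the source marginal of each component."""
    return posteriors(x, weights, joint_means[:, :1], joint_covs[:, :1, :1])

def classify_joint(x, y):
    """Classify using the full joint source/target vector."""
    return posteriors(np.hstack([x, y]), weights, joint_means, joint_covs)

x = np.array([[0.0]])  # one source frame
y = np.array([[2.0]])  # its aligned target frame

p_src = classify_source_only(x)   # equal posteriors: source alone is ambiguous
p_joint = classify_joint(x, y)    # target information resolves the ambiguity
print(p_src, p_joint)
```

In this toy setup the source-only classifier assigns equal probability to both components, while the joint classifier selects the second one almost with certainty. Joint information is only available on aligned training data, which is one reason additional cues such as phonetic context are attractive at conversion time.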

Bibliographic reference. Godoy, Elizabeth / Rosec, Olivier / Chonavel, Thierry (2009): "Alleviating the one-to-many mapping problem in voice conversion with context-dependent modeling", in INTERSPEECH-2009, 1627-1630.