10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Online Model Adaptation for Voice Conversion Using Model-Based Speech Synthesis Techniques

Dalei Wu (1), Baojie Li (1), Hui Jiang (1), Qian-Jie Fu (2)

(1) York University, Canada
(2) House Ear Institute, USA

In this paper, we present a novel voice conversion method based on model-based speech synthesis, intended for applications where neither prior knowledge nor training data is available from the source speaker. In the proposed method, training data from a target speaker is used to build a GMM-based speech model, and voice conversion is then performed on each utterance from the source speaker according to the pre-trained target speaker model. To reduce the mismatch between source and target speakers, we propose online model adaptation based on maximum likelihood linear regression (MLLR), which improves model selection accuracy. Objective and subjective evaluations suggest that the proposed methods generate acceptable voice quality even without training data from source speakers.
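As an illustration of the adaptation step only, the following is a minimal sketch of generic mean-only MLLR with a single global transform applied to a diagonal-covariance GMM. It is not the authors' implementation: the function names (`posteriors`, `mllr_mean_transform`, `select_components`), the use of one global transform, the diagonal covariances, and the fixed iteration count are all assumptions made for this example. The abstract does not specify the features, the number of transforms, or how speech is synthesized from the selected models.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian: (T, D) -> (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def posteriors(X, weights, means, variances):
    """Component posteriors gamma[t, m] for every frame, computed stably."""
    log_p = np.log(weights) + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def mllr_mean_transform(X, weights, means, variances, n_iter=5):
    """Estimate a global W (D x (D+1)) so that adapted means are W @ [1, mu].

    X holds source-speaker frames; the GMM parameters describe the
    pre-trained target-speaker model (hypothetical setup for this sketch).
    """
    M, D = means.shape
    xi = np.hstack([np.ones((M, 1)), means])        # extended means, (M, D+1)
    W = np.hstack([np.zeros((D, 1)), np.eye(D)])    # start from the identity
    for _ in range(n_iter):
        adapted = xi @ W.T                          # current adapted means
        gamma = posteriors(X, weights, adapted, variances)
        occ = gamma.sum(axis=0)                     # occupancy per component
        obs = gamma.T @ X                           # weighted frame sums, (M, D)
        for i in range(D):                          # row-wise closed-form update
            inv_var = 1.0 / variances[:, i]
            G = (xi * (occ * inv_var)[:, None]).T @ xi
            k = xi.T @ (obs[:, i] * inv_var)
            # lstsq tolerates a rank-deficient G when few components are seen
            W[i] = np.linalg.lstsq(G, k, rcond=None)[0]
    return W

def select_components(X, weights, means, variances, W):
    """Pick the most likely adapted component for each source frame."""
    xi = np.hstack([np.ones((len(means), 1)), means])
    return posteriors(X, weights, xi @ W.T, variances).argmax(axis=1)
```

Under these assumptions, `mllr_mean_transform` would be run on the frames of an incoming source utterance, and `select_components` would then choose a target-model component per frame using the adapted means. Only the means are transformed here; variances and mixture weights are left unchanged.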


Bibliographic reference. Wu, Dalei / Li, Baojie / Jiang, Hui / Fu, Qian-Jie (2009): "Online model adaptation for voice conversion using model-based speech synthesis techniques", in INTERSPEECH-2009, 1643-1646.