Toward multiple-language TTS: experiments in English and Mandarin

Raul Fernandez, Wei Zhang, Ellen Eide, Raimo Bakis, Wael Hamza, Yi Liu, Michael Picheny, John F. Pitrelli, Yong Qing, Zhi Wei Shuang, Li Qin Shen

Text-to-speech systems have improved dramatically in recent years through the use of corpus-based concatenative approaches, and there is growing interest in enabling them to handle languages other than the one for which they were originally developed. In this paper we present ongoing work at IBM on text-to-speech systems that can produce high-quality synthesis in more than one language. We illustrate the discussion with a case study in which two systems, originally developed to support English and Mandarin respectively, have each been extended to support the other's language. We describe the challenges faced when adapting a system to a different target language, propose adaptation solutions, and present the results of perceptual tests carried out to evaluate how the adapted systems perform relative to the native systems.

