ISCA Archive SSW 2010

An unified and automatic approach of Mandarin HTS system

Yong Guan, Jilei Tian, Yi-Jian Wu, Junichi Yamagishi, Jani Nurminen

Most studies on Mandarin HTS (HMM-based text-to-speech system) have taken the initial/final as the basic acoustic unit. It is, however, challenging to develop a multilingual HTS in a uniform and consistent way, since most other languages use the phoneme as the basic phonetic unit. Cross-lingual adaptation, which requires mapping phonemes between languages, becomes hard to apply, particularly in a unified ASR and HTS system, because most ASR systems are phoneme-based. In this paper, we propose a phoneme-based Mandarin HTS system and systematically evaluate it against the initial/final system. The experimental results show that using the phoneme as the acoustic unit for Mandarin HTS is a promising unified approach: it enables better and more uniform development with other languages while significantly reducing the number of acoustic units. The flat-start training scheme is also evaluated, showing that the phoneme segmentation problem is solved without any performance degradation for the phoneme-based Mandarin HTS system. The result is an automatic approach that does not depend on any particular ASR system.

Index Terms: speech synthesis, Mandarin HTS, flat-start training, speaker adaptation

Cite as: Guan, Y., Tian, J., Wu, Y.-J., Yamagishi, J., Nurminen, J. (2010) An unified and automatic approach of Mandarin HTS system. Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7), 236-239

  author={Yong Guan and Jilei Tian and Yi-Jian Wu and Junichi Yamagishi and Jani Nurminen},
  title={{An unified and automatic approach of Mandarin HTS system}},
  booktitle={Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7)},