ISCA Archive Interspeech 2005

An overview of nitech HMM-based speech synthesis system for blizzard challenge 2005

Heiga Zen, Tomoki Toda

This paper describes the hidden Markov model (HMM) based speech synthesis system developed at Nagoya Institute of Technology (Nitech-HTS) for Blizzard Challenge 2005, a competition among text-to-speech synthesis systems built from the same speech databases. We give an overview of the basic HMM-based speech synthesis system and then illustrate recent developments incorporated into the latest version, such as STRAIGHT-based vocoding, hidden semi-Markov model (HSMM) based acoustic modeling, and parameter generation considering global variance. The constructed voices synthesize speech at around 0.3 xRT (real-time ratio), and their footprints are less than 2 MB. The listening test results show that our systems performed much better than we expected.


doi: 10.21437/Interspeech.2005-76

Cite as: Zen, H., Toda, T. (2005) An overview of nitech HMM-based speech synthesis system for blizzard challenge 2005. Proc. Interspeech 2005, 93-96, doi: 10.21437/Interspeech.2005-76

@inproceedings{zen05_interspeech,
  author={Heiga Zen and Tomoki Toda},
  title={{An overview of nitech HMM-based speech synthesis system for blizzard challenge 2005}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={93--96},
  doi={10.21437/Interspeech.2005-76}
}