Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

An Overview of Nitech HMM-Based Speech Synthesis System for Blizzard Challenge 2005

Heiga Zen, Tomoki Toda

Nagoya Institute of Technology, Japan

This paper describes the hidden Markov model (HMM) based speech synthesis system developed at the Nagoya Institute of Technology (Nitech-HTS) for Blizzard Challenge 2005, a competition in which text-to-speech synthesis systems are built from the same speech databases. We first give an overview of the basic HMM-based speech synthesis system and then describe the recent developments incorporated into the latest version: STRAIGHT-based vocoding, hidden semi-Markov model (HSMM) based acoustic modeling, and parameter generation considering global variance. The constructed voices synthesize speech at around 0.3 xRT (real-time ratio), and their footprints are less than 2 MB. The listening test results show that our systems performed much better than we had expected.
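
As a reading aid for the "0.3 xRT" figure, the following minimal sketch computes a real-time ratio under the common interpretation that it is the processing time divided by the duration of the generated speech, so values below 1 mean faster than real time. The function name and the timing values are illustrative assumptions and are not taken from the paper.

```python
def real_time_ratio(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by the duration of the synthesized audio.

    Assumed definition: a value below 1.0 means synthesis runs faster
    than real time. This helper is hypothetical, not part of Nitech-HTS.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if synthesis_seconds < 0:
        raise ValueError("synthesis_seconds must be non-negative")
    return synthesis_seconds / audio_seconds


# Illustrative numbers only: 3 s of computation for 10 s of speech.
ratio = real_time_ratio(synthesis_seconds=3.0, audio_seconds=10.0)
print(f"{ratio:.1f} xRT")
```

Under this definition, a system running at 0.3 xRT would need roughly 3 seconds of computation to produce 10 seconds of speech.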

Bibliographic reference. Zen, Heiga / Toda, Tomoki (2005): "An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005", In INTERSPEECH-2005, 93-96.