9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Evaluation of Finnish Unit Selection and HMM-Based Speech Synthesis

Hanna Silen (1), Elina Helander (1), Jani Nurminen (2), Moncef Gabbouj (1)

(1) Tampere University of Technology, Finland; (2) Nokia Devices R&D, Finland

Unit selection and hidden Markov model (HMM) based synthesis have become the dominant techniques in text-to-speech (TTS) research. In this work, we combine HMM-based signal generation with the front end originally designed for unit-selection-based Finnish TTS, and we evaluate the prosody of the output generated by the two synthesis techniques using the same speech database. Furthermore, we study the effect of training set size on prosody and intelligibility in HMM-based synthesis. The results indicate that the HMM-based approach is capable of providing better prosody than unit selection even when the training set size is severely limited. The size of the training set does, however, affect the prosodic quality and intelligibility of the HMM-based synthesizer.

Bibliographic reference. Silen, Hanna / Helander, Elina / Nurminen, Jani / Gabbouj, Moncef (2008): "Evaluation of Finnish unit selection and HMM-based speech synthesis", in INTERSPEECH-2008, 1853-1856.