9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

HMM-Based Finnish Text-to-Speech System Utilizing Glottal Inverse Filtering

Tuomo Raitio (1), Antti Suni (2), Hannu Pulakka (1), Martti Vainio (2), Paavo Alku (1)

(1) Helsinki University of Technology, Finland; (2) University of Helsinki, Finland

This paper describes an HMM-based speech synthesis system that utilizes glottal inverse filtering for generating natural-sounding synthetic speech. In the proposed system, speech is first parametrized into spectral and excitation features using a method based on glottal inverse filtering. The parameters are fed into an HMM system for training and then generated from the trained HMM according to text input. Glottal flow pulses extracted from real speech are used as a voice source, and the voice source is further modified according to the all-pole model parameters generated by the HMM. Preliminary experiments show that the proposed system is capable of generating natural-sounding speech, and that its quality is clearly better than that of a system utilizing a conventional impulse train excitation model.
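
As a rough illustration of the source-filter idea summarized above, the following Python sketch contrasts a conventional impulse train excitation with an excitation built from repeated glottal-like pulses, and passes either one through an all-pole filter. It is not the authors' implementation: the function names, the synthetic pulse shape, and the filter coefficients are all assumptions made for this example. The paper uses glottal flow pulses extracted from real speech and all-pole parameters generated by the HMM, neither of which is reproduced here.

```python
import math


def impulse_train(n_samples, period):
    """Conventional excitation: one unit impulse at the start of each pitch period."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def synthetic_glottal_pulse(length, open_fraction=0.6):
    """Hypothetical stand-in for an extracted glottal flow pulse.

    Raised-cosine opening phase followed by a quarter-cosine closing phase.
    The paper takes its pulses from real speech instead.
    """
    n_open = max(1, round(length * open_fraction))
    n_close = length - n_open
    opening = [0.5 * (1.0 - math.cos(math.pi * n / n_open)) for n in range(n_open)]
    closing = [math.cos(0.5 * math.pi * n / n_close) for n in range(n_close)]
    return opening + closing


def pulse_train(n_samples, period, pulse):
    """Place a copy of `pulse` at every pitch mark, summing where copies overlap."""
    out = [0.0] * n_samples
    for start in range(0, n_samples, period):
        for k, value in enumerate(pulse):
            if start + k < n_samples:
                out[start + k] += value
    return out


def all_pole_filter(x, a):
    """Filter x through 1 / A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...

    Difference equation: y[n] = x[n] - sum_k a[k-1] * y[n-k].
    """
    y = []
    for n, sample in enumerate(x):
        acc = sample
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                acc -= coeff * y[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    period = 80                  # assumed pitch period in samples
    a = [-1.3, 0.8]              # assumed stable all-pole coefficients
    pulse = synthetic_glottal_pulse(60)

    baseline = all_pole_filter(impulse_train(400, period), a)
    proposed = all_pole_filter(pulse_train(400, period, pulse), a)
    print(len(baseline), len(proposed))
```

The only difference between the two signal paths is the excitation; the all-pole filter is shared, which mirrors the comparison described in the abstract at a very coarse level.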

Bibliographic reference. Raitio, Tuomo / Suni, Antti / Pulakka, Hannu / Vainio, Martti / Alku, Paavo (2008): "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering", In INTERSPEECH-2008, 1881-1884.