The new generation of text-to-speech systems needs the ability to control the voice quality of the synthesized speech by varying the excitation source. This feature is fundamental to improving naturalness and to synthesizing female or child voices. Variation of the voice quality is also important when trying to synthesize expression. The problem involves two aspects: the ability to control the source parameters of the speech synthesizer, and the possibility of extracting these parameters from natural speech. This paper describes a source model, based on the polynomial model for the glottal flow suggested by Rosenberg, that has an exact representation in the frequency domain, and an automated procedure to estimate its parameters from natural speech.
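As an illustration of the kind of source model referred to above, the following is a minimal sketch of the classic Rosenberg polynomial glottal pulse (a cubic opening phase followed by a parabolic closing phase). It is not the model variant or the estimation procedure described in this paper. The function names and the `open_quotient` and `speed_quotient` parameters with their default values are assumptions chosen for illustration.

```python
def rosenberg_pulse(t, t_p, t_n, amplitude=1.0):
    """Glottal flow at time t (seconds), measured from the glottal opening.

    t_p: duration of the opening phase, t_n: duration of the closing phase.
    Classic Rosenberg polynomial pulse, not necessarily the paper's variant.
    """
    if 0.0 <= t <= t_p:
        x = t / t_p
        # Opening phase: rises smoothly from 0 to the peak amplitude.
        return amplitude * (3.0 * x ** 2 - 2.0 * x ** 3)
    if t_p < t <= t_p + t_n:
        x = (t - t_p) / t_n
        # Closing phase: parabolic fall from the peak back to 0.
        return amplitude * (1.0 - x ** 2)
    # Closed phase: no flow for the rest of the pitch period.
    return 0.0


def sample_period(f0, fs, open_quotient=0.6, speed_quotient=2.0):
    """Sample one pitch period of the pulse (hypothetical helper).

    open_quotient: fraction of the period during which the glottis is open.
    speed_quotient: ratio of opening-phase to closing-phase duration.
    Both defaults are illustrative assumptions, not values from the paper.
    """
    period = 1.0 / f0
    open_time = open_quotient * period
    t_p = open_time * speed_quotient / (1.0 + speed_quotient)
    t_n = open_time - t_p
    n = int(round(fs / f0))
    return [rosenberg_pulse(i / fs, t_p, t_n) for i in range(n)]


if __name__ == "__main__":
    pulse = sample_period(f0=100.0, fs=16000.0)
    print(len(pulse), max(pulse))
```

Changing `open_quotient` and `speed_quotient` alters the pulse shape, which is one simple way to picture the voice quality control that the abstract describes.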
Bibliographic reference. Oliveira, Luis C. (1993): "Estimation of source parameters by frequency analysis", In EUROSPEECH'93, 99-102.