We describe an approach to simulating different phonation types, following John Laver's terminology, by means of a hybrid (rule-based and unit-concatenating) formant synthesizer. Different voice qualities were generated by following hints from the literature and applying the revised KLGLOTT88 model. In a listener perception experiment, we show that listeners distinguish the phonation types and that these evoke the emotional impressions predicted by the literature. The synthesis system and its source code, as well as audio samples, can be downloaded at http://emoSyn.syntheticspeech.de/.
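As a rough illustration of how a glottal source parameter can separate phonation types, the sketch below generates one period of glottal flow with the basic KLGLOTT88 polynomial, Ug(t) = a·t² − b·t³ during the open phase and zero otherwise. It is not the revised model used in the paper, and it is not taken from the emoSyn source code. The function name `klglott88_pulse`, the sampling rate, and the open-quotient values per phonation type are assumptions chosen only to show the general tendency reported in the literature (a larger open quotient for breathy voice, a smaller one for tense voice).

```python
import numpy as np

def klglott88_pulse(f0=100.0, oq=0.5, av=1.0, fs=16000):
    """Return one period of glottal flow from the basic KLGLOTT88 polynomial.

    f0: fundamental frequency in Hz
    oq: open quotient (fraction of the period with an open glottis)
    av: amplitude of voicing (linear scale factor)
    fs: sampling rate in Hz (assumed value)
    """
    t0 = 1.0 / f0                      # period duration in seconds
    n = int(round(fs * t0))            # samples per period
    t = np.arange(n) / fs              # time axis for one period
    te = oq * t0                       # instant of glottal closure
    a = 27.0 * av / (4.0 * oq**2 * t0)
    b = 27.0 * av / (4.0 * oq**3 * t0**2)
    # Flow follows the polynomial while the glottis is open, else zero.
    return np.where(t <= te, a * t**2 - b * t**3, 0.0)

# Illustrative open quotients only; these numbers are not from the paper.
OQ_BY_PHONATION = {"tense": 0.3, "modal": 0.5, "breathy": 0.8}

pulses = {name: klglott88_pulse(oq=oq) for name, oq in OQ_BY_PHONATION.items()}
open_samples = {name: int(np.count_nonzero(p > 1e-9)) for name, p in pulses.items()}
```

In this sketch only the open quotient changes between phonation types. A fuller rule set would also vary parameters such as spectral tilt and aspiration noise, which are left out here.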
Bibliographic reference. Burkhardt, Felix (2009): "Rule-based voice quality variation with formant synthesis", In INTERSPEECH-2009, 2659-2662.