8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Testing the Relevance of Speech Rate, Pitch and a Glottal Chink for the Perception of Age in Synthesized Speech Using Formant Synthesis

Ralf Winkler

Technische Universität Berlin, Germany

Listeners are able to rate a speaker's age with reasonable accuracy. However, it is still controversial which features reliably signal a speaker's age. This paper presents the results of a synthesis study in which speech rate, pitch, and a glottal chink were varied systematically, over ranges that actually occur in natural speech, to shift the mean perceived age.

The strongest impact on age judgements was found for (i) speech rate, followed by (ii) the glottal chink, while the impact of pitch was only marginal. Some interactions (iii) between the parameters were observed as well.

Results regarding (i) and (ii) show that formant synthesis is capable of producing speech that varies considerably in its mean perceived age, even if only a small number of features are manipulated. Regarding (iii), the results indicate that studies of the impact of selected features should also consider the interactions between those features.

Bibliographic reference. Winkler, Ralf (2007): "Testing the relevance of speech rate, pitch and a glottal chink for the perception of age in synthesized speech using formant synthesis", In INTERSPEECH-2007, 2653-2656.