INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Utterance-Final Glottalization as a Cue for Familiar Speaker Recognition

Tamás Böhm (1), Stefanie Shattuck-Hufnagel (2)

(1) BME, Hungary; (2) MIT, Cambridge, MA, USA

Several studies have reported systematic differences across speakers in the rate and type of intermittent irregular vocal fold vibration (glottalization). It remains an open question, however, whether human listeners use this speaker-specific information as a cue for recognizing familiar voices. A perceptual experiment was conducted to investigate this issue, concentrating on irregularity in utterance-final position. A novel method was employed to manipulate the final voice quality (in our case, modal or glottalized). Listeners who were familiar with the speakers' voices were presented with pairs of speech samples: one with the original final voice quality and one with a manipulated final voice quality. When asked to select the member of the pair that was closer to the talker's voice, listeners chose the unmanipulated token in 63% of the trials. This result suggests that irregular pitch periods in utterance-final regions play a role in the recognition of individual speakers' voices.

Acoustic Material

Fig1a.wav
Fig1b.wav
Fig1c.wav
Fig1d.wav
The sound files contain the four stimuli shown in Figure 1 of the paper: unmanipulated recordings with glottalized (a) and modal (c) endings, and their corresponding manipulated versions, created by concatenation (b) and cycle removal (d).

Bibliographic reference. Böhm, Tamás / Shattuck-Hufnagel, Stefanie (2007): "Utterance-final glottalization as a cue for familiar speaker recognition", in INTERSPEECH-2007, 2657-2660.