5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Dynamic vs. Static Spectral Detail in the Perception of Gated Stops

Michael Kiefte, Terrance M. Nearey

University of Alberta, Canada

To assess the importance of dynamic spectral information in the first few milliseconds after oral release for the identification of prevocalic stop consonants, 23.75 ms gated CV syllables were presented to listeners for identification. Listeners also heard the same tokens reconstructed from their minimum-phase decomposition, so that each reconstruction had the same long-term power spectrum as its original counterpart but different internal dynamic spectral detail. Listeners' responses were then modelled by logistic regression on mel cepstral coefficients, with and without dynamic spectral information encoded, to show the effect of reduced temporal information in the context of automatic classification. Preliminary results show that listeners use some dynamic spectral detail even for very short stimuli. We conclude that models of speech perception must take spectral variation over very short time frames into account.
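The abstract does not say how the minimum-phase reconstruction was computed. As a rough illustration only, the sketch below uses one common approach, the homomorphic (real-cepstrum) method: take the log magnitude spectrum of the whole token, fold its cepstrum onto non-negative quefrencies, and resynthesise. The function names `dft` and `minimum_phase`, the magnitude floor, and the synthetic 64-sample "gate" are all assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short gates)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def minimum_phase(x, floor=1e-12):
    """Minimum-phase signal with the same DFT magnitudes as x.

    Assumes len(x) is even. `floor` (an assumption of this sketch)
    guards against log(0) at spectral nulls.
    """
    n = len(x)
    if n % 2:
        raise ValueError("this sketch assumes an even-length input")
    # Real cepstrum: inverse DFT of the log magnitude spectrum.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    cep = [v.real for v in dft(log_mag, inverse=True)]
    # Fold the cepstrum: keep c[0] and c[N/2], double 0 < n < N/2,
    # and zero the remaining (negative-quefrency) terms.
    folded = [0.0] * n
    folded[0] = cep[0]
    folded[n // 2] = cep[n // 2]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cep[i]
    # Exponentiate the folded log spectrum and return to the time domain.
    spec = [cmath.exp(v) for v in dft(folded)]
    return [v.real for v in dft(spec, inverse=True)]


# Synthetic stand-in for a gated token: 8 silent samples, then a decaying
# cosine. This is NOT one of the paper's stimuli.
gate = [0.0] * 8 + [math.exp(-0.1 * t) * math.cos(0.6 * t) for t in range(56)]
recon = minimum_phase(gate)

mag_orig = [abs(v) for v in dft(gate)]
mag_recon = [abs(v) for v in dft(recon)]
max_diff = max(abs(a - b) for a, b in zip(mag_orig, mag_recon))
print("largest magnitude difference:", max_diff)
```

Because the magnitude at every DFT bin is preserved, the reconstruction has the same power spectrum over the whole gate, while its energy is shifted toward the start of the window, so the spectral detail within the gate differs from the original.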

Sound Examples: Original, Minimum phase reconstruction

Bibliographic reference. Kiefte, Michael / Nearey, Terrance M. (1998): "Dynamic vs. static spectral detail in the perception of gated stops", in ICSLP-1998, paper 0898.