Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Predicting the Perceptual Confusion of Synthetic Plosive Consonants in Noise

James J. Hant and Abeer Alwan

Dept. of Electrical Engineering, UCLA, Los Angeles, CA, USA

In previous work, a novel time/frequency detection model was developed based on psychoacoustic masking experiments and used to predict the noise masking of speech-like bursts and formant transitions [5]. In this paper, the same model is used to predict the discrimination of voiced synthetic plosive consonants in a variety of noisy environments. Discrimination experiments were conducted using synthetic /bV/, /dV/, and /gV/ syllables and two different additive noise maskers (speech-shaped and perceptually flat). Experiments were conducted across three vowel contexts (/a/, /i/, and /u/) using CV syllables both with and without a noise burst.

Results show that discrimination thresholds depend strongly on the noise masker, vowel context, and plosive consonant. For all experimental conditions, adding the burst has little effect on thresholds, suggesting that the perception of plosive consonants in noise is dominated by the formant transition cue.

The previously derived time/frequency detection model was then used to predict the perceptual data. The model successfully predicts most of the results but overpredicts discrimination thresholds for /bi/ and /di/.



Bibliographic reference. Hant, James J., and Alwan, Abeer (2000): "Predicting the perceptual confusion of synthetic plosive consonants in noise", in ICSLP-2000, vol. 3, 941-944.